Abstract

The unitary polar factor $Q = U_p$ in the polar decomposition of $Z = U_p H$ is the
minimizer for both $\|\operatorname{Log}(Q^*Z)\|^2$ and its Hermitian part
$\|\operatorname{sym}_*(\operatorname{Log}(Q^*Z))\|^2$ over both $\mathbb{R}$ and $\mathbb{C}$ for any
given invertible matrix $Z \in \mathbb{C}^{n\times n}$ and any matrix logarithm Log, not
necessarily the principal logarithm log. We prove this for the spectral matrix norm in
any dimension and for the Frobenius matrix norm in two and three dimensions. The
result shows that the unitary polar factor is the nearest orthogonal matrix to $Z$ not
only in the normwise sense, but also in a geodesic distance. The derivation is based on
Bhatia's generalization of Bernstein's trace inequality for the matrix exponential and
a new sum of squared logarithms inequality. Our result generalizes the fact for scalars
that for any complex logarithm and for all $z \in \mathbb{C}\setminus\{0\}$

\[ \min_{\vartheta\in(-\pi,\pi]} |\operatorname{Log}_{\mathbb{C}}(e^{-i\vartheta}z)|^2 = |\log|z||^2 , \qquad
   \min_{\vartheta\in(-\pi,\pi]} |\Re\operatorname{Log}_{\mathbb{C}}(e^{-i\vartheta}z)|^2 = |\log|z||^2 . \]

1 Introduction



1.1 The polar decomposition 

Every matrix $Z \in \mathbb{C}^{m\times n}$ admits a polar decomposition

\[ Z = U_p H, \]

where the unitary polar factor $U_p$ has orthonormal columns and $H$ is Hermitian positive
semidefinite [2], [17, Ch. 8]. The decomposition is unique if $Z$ has full column rank. In the
following we assume that $Z$ is an invertible matrix, in which case $H$ is positive definite. The
polar decomposition is the matrix analog of the polar form of a complex number

\[ z = e^{i\arg(z)}\, r, \qquad r = |z| \ge 0, \qquad -\pi < \arg(z) \le \pi . \]
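As a small illustration of the definition, the following sketch computes the polar decomposition of a square matrix from its singular value decomposition. The helper name `polar_decomposition` and the random test matrix are assumptions made for this example only; the SVD route is one standard way to obtain $U_p$ and $H$, not the iterations cited in the text.

```python
import numpy as np

def polar_decomposition(Z):
    """Return (U_p, H) with Z = U_p @ H for a square matrix Z."""
    W, s, Vh = np.linalg.svd(Z)            # Z = W @ diag(s) @ Vh
    U_p = W @ Vh                           # unitary polar factor
    H = Vh.conj().T @ np.diag(s) @ Vh      # Hermitian factor sqrt(Z^* Z)
    return U_p, H

rng = np.random.default_rng(0)
Z = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
U_p, H = polar_decomposition(Z)

print(np.allclose(U_p @ H, Z))                      # the factorization holds
print(np.allclose(U_p.conj().T @ U_p, np.eye(3)))   # U_p is unitary
print(np.linalg.eigvalsh(H))                        # eigenvalues of H are the singular values of Z
```

For an invertible $Z$ all eigenvalues of $H$ are positive, which matches the positive definiteness assumed in the text.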

The polar decomposition has a wide variety of applications: the solution to the Euclidean
orthogonal Procrustes problem $\min_{Q\in \mathrm{U}(n)} \|Z - BQ\|_F^2$ is given by the unitary polar
factor of $B^*Z$ [UJ Ch. 12], and the polar decomposition can be used as the crucial tool for
computing the eigenvalue decomposition of symmetric matrices and the singular value
decomposition (SVD) [30]. Practical methods for computing the polar decomposition are the
scaled Newton iteration [11] and the QR-based dynamically weighted Halley iteration [28], and
their backward stability is established in [29].

The unitary polar factor $U_p$ has the important property [HI Thm. IX.7.2], [13], [17, p. 197],
[20, p. 454] that it is the nearest unitary matrix to $Z \in \mathbb{C}^{n\times n}$, that is,

\[ \min_{Q\in \mathrm{U}(n)} \|Z - Q\|^2 = \min_{Q\in \mathrm{U}(n)} \|Q^*Z - I\|^2 = \|U_p^*Z - I\|^2 = \|\sqrt{Z^*Z} - I\|^2 , \tag{1.1} \]

where $\|\cdot\|$ denotes any unitarily invariant norm. For the Frobenius matrix norm this
optimality implies for real $Z \in \mathbb{R}^{n\times n}$ and the orthogonal polar factor [23]

\[ \forall\, Q \in \mathrm{O}(n): \quad \operatorname{tr}(Q^TZ) = \langle Q, Z\rangle \le \langle U_p, Z\rangle = \operatorname{tr}(U_p^TZ) . \tag{1.2} \]

In the complex case we similarly have [20, Thm. 7.4.9, p. 432]

\[ \forall\, Q \in \mathrm{U}(n): \quad \Re\operatorname{tr}(Q^*Z) \le \Re\operatorname{tr}(U_p^*Z) . \tag{1.3} \]
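The nearness property (1.1) and the trace inequality (1.2) can be probed numerically. The sketch below, assuming a random real test matrix and orthogonal competitors drawn from QR factorizations of Gaussian matrices (both choices are illustrative, not from the paper), compares the polar factor against 1000 sampled orthogonal matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
Z = rng.standard_normal((n, n))

# Orthogonal polar factor from the SVD Z = W diag(s) V^T
W, s, Vt = np.linalg.svd(Z)
U_p = W @ Vt

def random_orthogonal(rng, n):
    # The Q factor of a Gaussian matrix is an orthogonal matrix
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return Q

samples = [random_orthogonal(rng, n) for _ in range(1000)]

best_trace = np.trace(U_p.T @ Z)                  # equals the sum of the singular values
best_dist = np.linalg.norm(Z - U_p, "fro")
max_trace = max(np.trace(Q.T @ Z) for Q in samples)
min_dist = min(np.linalg.norm(Z - Q, "fro") for Q in samples)

print(best_trace, max_trace)   # no sampled trace exceeds the polar-factor trace
print(best_dist, min_dist)     # no sampled distance falls below the polar-factor distance
```

A sampling test of this kind can only fail to find a counterexample; it does not replace the proof of (1.1).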

For invertible $Z \in GL^+(n,\mathbb{R})$ and the Frobenius matrix norm $\|\cdot\|_F$ it can be shown
that [9], [24]

\[ \min_{Q\in \mathrm{O}(n)} \mu\, \|\operatorname{sym}_*(Q^TZ - I)\|_F^2 + \mu_c\, \|\operatorname{skew}_*(Q^TZ - I)\|_F^2 = \mu\, \|U_p^TZ - I\|_F^2 , \tag{1.4} \]

for $\mu_c \ge \mu > 0$. Here, $\operatorname{sym}_*(X) = \tfrac12(X^* + X)$ is the Hermitian part and
$\operatorname{skew}_*(X) = \tfrac12(X - X^*)$ is the skew-Hermitian part of $X$. The family (1.4) appears as
strain energy expression in geometrically exact Cosserat extended continuum models
[231 E21 EH EH EE].

Surprisingly, the optimality (1.4) of the orthogonal polar factor ceases to be true for
$0 \le \mu_c < \mu$. Indeed, for $\mu_c = 0$ one can show that [36] there exist $Z \in \mathbb{R}^{3\times 3}$ such that

\[ \min_{Q\in \mathrm{SO}(3)} \|\operatorname{sym}_*(Q^TZ - I)\|_F^2 < \|U_p^TZ - I\|_F^2 . \tag{1.5} \]

By compactness of $\mathrm{SO}(n)$ and continuity of $Q \mapsto \|\operatorname{sym}_*(Q^TZ - I)\|_F$ it is clear that the
minimum in (1.5) exists. Here, the polar factor $U_p$ of $Z \in GL^+(3,\mathbb{R})$ is always a critical point,
but is not necessarily (even locally) minimal. In contrast to $\|X\|_F^2$, the term $\|\operatorname{sym}_* X\|_F^2$ is not
invariant w.r.t. left-action of $\mathrm{SO}(3)$ on $X$, which does explain the appearance of nonclassical
solutions in (1.5) since now

\[ \|\operatorname{sym}_*(Q^TZ - I)\|_F^2 = \|\operatorname{sym}_* Q^TZ\|_F^2 - 2\operatorname{tr}(Q^TZ) + 3 \tag{1.6} \]

and optimality does not reduce to optimality of the trace term (1.2). The reason there is no
nonclassical solution in (1.4) is that for $\mu_c \ge \mu > 0$ we have

\[ \begin{aligned}
 \min_{Q\in \mathrm{O}(n)} \; & \mu\, \|\operatorname{sym}_*(Q^TZ - I)\|_F^2 + \mu_c\, \|\operatorname{skew}_*(Q^TZ - I)\|_F^2 \\
 & \ge \min_{Q\in \mathrm{O}(n)} \mu\, \|\operatorname{sym}_*(Q^TZ - I)\|_F^2 + \mu\, \|\operatorname{skew}_*(Q^TZ - I)\|_F^2
   = \min_{Q\in \mathrm{O}(n)} \mu\, \|Q^TZ - I\|_F^2 .
\end{aligned} \tag{1.7} \]



1.2 The matrix logarithm minimization problem and results 

Formally, we obtain our minimization problem by replacing the matrix $Q^TZ - I$ by the
matrix $\operatorname{Log}(Q^TZ)$ in (1.4). Then, introducing the weights $\mu, \mu_c \ge 0$ we embed the problem
in a more general family of minimization problems at a given $Z \in GL^+(n,\mathbb{R})$

\[ \min_{Q\in \mathrm{SO}(n)} \mu\, \|\operatorname{sym}_*\operatorname{Log}(Q^TZ)\|_F^2 + \mu_c\, \|\operatorname{skew}_*\operatorname{Log}(Q^TZ)\|_F^2 , \qquad \mu > 0,\ \mu_c \ge 0 . \tag{1.8} \]

For the solution of (1.8) we consider separately the minimization of

\[ \min_{Q\in \mathrm{U}(n)} \|\operatorname{Log}(Q^*Z)\|^2 , \qquad \min_{Q\in \mathrm{U}(n)} \|\operatorname{sym}_*\operatorname{Log}(Q^*Z)\|^2 , \tag{1.9} \]

on the group of unitary matrices $Q \in \mathrm{U}(n)$ and with respect to any matrix logarithm Log.

We show that the unitary polar factor $U_p$ is a minimizer of both terms in (1.9) for both
the Frobenius norm (dimension $n = 2, 3$) and the spectral matrix norm for arbitrary $n \in \mathbb{N}$,
and the minimum is attained when the principal logarithm is taken.

Finally, we show that the minimizer of the real problem (1.8) for all $\mu > 0$, $\mu_c \ge 0$ is also
given by the polar factor $U_p$. Note that $\operatorname{sym}_*(Q^TZ - I)$ is the leading order approximation
to the Hermitian part of the logarithm $\operatorname{sym}_*\operatorname{Log} Q^TZ$ in the neighborhood of the identity $I$,
and recall the non-optimality of the polar factor in (1.5) for $\mu_c = 0$. The optimality of the
polar factor in (1.9) is therefore rather unexpected. Our result implies also that different
members of the family of Riemannian metrics $g_X$ on the tangent space (1.22) lead to the
same Riemannian distance to the compact subgroup $\mathrm{SO}(3)$, see [35].

Since we prove that the unitary polar factor $U_p$ is the unique minimizer for (1.8) and
(1.9)$_1$ in the Frobenius matrix norm for $n \le 3$, it follows that these new optimality properties
of $U_p$ provide another characterization of the polar decomposition.
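The claimed optimality in (1.8) can be explored numerically for $n = 3$. The following sketch is only a sanity check under stated assumptions: the logarithm is computed from an eigendecomposition (helper `logm_eig`, which presumes a diagonalizable argument), the competitors are rotations sampled near $U_p$, and the weights $\mu = 1$, $\mu_c = 0.5$ are arbitrary choices.

```python
import numpy as np

def logm_eig(A):
    """A logarithm of A from its eigendecomposition (assumes A is diagonalizable)."""
    lam, V = np.linalg.eig(A)
    return V @ np.diag(np.log(lam.astype(complex))) @ np.linalg.inv(V)

def expm_skew(W):
    """exp(W) for real skew-symmetric W, using that 1j*W is Hermitian."""
    lam, V = np.linalg.eigh(1j * W)
    return (V @ np.diag(np.exp(-1j * lam)) @ V.conj().T).real

def sym(X):
    return 0.5 * (X + X.conj().T)

def skew(X):
    return 0.5 * (X - X.conj().T)

def energy(Q, Z, mu=1.0, mu_c=0.5):
    """mu*||sym Log(Q^T Z)||_F^2 + mu_c*||skew Log(Q^T Z)||_F^2, cf. (1.8)."""
    X = logm_eig(Q.T @ Z)
    return mu * np.linalg.norm(sym(X), "fro") ** 2 + mu_c * np.linalg.norm(skew(X), "fro") ** 2

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
Z = A if np.linalg.det(A) > 0 else -A          # Z in GL^+(3, R)
W, s, Vt = np.linalg.svd(Z)
U_p = W @ Vt                                   # orthogonal polar factor
reference = np.sum(np.log(s) ** 2)             # ||log H||_F^2, the value at Q = U_p for mu = 1

sampled = []
for _ in range(200):
    B = rng.standard_normal((3, 3))
    Q = U_p @ expm_skew(0.3 * (B - B.T))       # a rotation near U_p
    sampled.append(energy(Q, Z))

print(energy(U_p, Z), reference, min(sampled))
```

In this experiment no sampled rotation is expected to produce a value below the one at $U_p$, in line with the optimality result stated above.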

In our optimality proof we do not use differential calculus on the nonlinear manifold
$\mathrm{SO}(n)$ for the real case because the derivative of the matrix logarithm is analytically not
tractable. However, if we assume a priori that the minimizer $Q \in \mathrm{SO}(n)$ can be found
in the set $\{Q \in \mathrm{SO}(n) \mid \|Q^TZ - I\|_F \le q < 1\}$, we can use the power series expansion of
the principal logarithm and differential calculus to show that the polar factor is indeed the
unique minimizer (unpublished).

Instead, motivated by insight gained in the simple complex case, we first consider the
Hermitian minimization problem, which has the advantage of allowing us to work with the
positive definite Hermitian matrix $\exp \operatorname{sym}_*\operatorname{Log} Q^*Z$. A subtlety that we encounter several
times is the possible non-uniqueness of the matrix logarithm Log. Our optimality result
relies crucially on unitary invariance and a Bernstein-type trace inequality [3]

\[ \operatorname{tr}(\exp X \exp X^*) \le \operatorname{tr}(\exp(X + X^*)) , \tag{1.10} \]

for the matrix exponential. Together, these imply some algebraic conditions on the
eigenvalues in case of the Frobenius matrix norm, which we exploit using a new sum of squared
logarithms inequality jS]. The case of the spectral norm is considerably easier.

This paper is organized as follows. In the remainder of this section we describe an
application that motivated this work. In Section 2 we present two-dimensional analogues to
our minimization problems in both complex and real matrix representations, to illustrate
the general approach and notation. In Section 3 we collect properties of the matrix logarithm
and its Hermitian part. Section 4 contains the main results where we discuss the unitary
minimization (1.9). From the complex case we then infer the real case in Section 5 and
finally discuss uniqueness in Section 6.

Notation. $\sigma_i(X) = \sqrt{\lambda_i(X^*X)}$ denotes the $i$-th largest singular value of $X$.
$\|X\|_2 = \sigma_1(X)$ is the spectral matrix norm, $\|X\|_F = \sqrt{\sum_{i,j=1}^n |X_{ij}|^2}$ is the Frobenius
matrix norm with associated inner product $\langle X, Y\rangle = \operatorname{tr}(X^*Y)$. The symbol $I$ denotes the
identity matrix. An identity involving $\|\cdot\|$ without subscripts holds for any unitarily invariant
norm. To avoid confusion between the unitary polar factor and the singular value decomposition
(SVD) of $Z = U\Sigma V^*$, $U_p$ with the subscript $p$ always denotes the unitary polar factor, while $U$
denotes the matrix of left singular vectors. Hence for example $Z = U_p H = U\Sigma V^*$. $\mathrm{U}(n)$,
$\mathrm{O}(n)$, $GL(n,\mathbb{C})$, $GL^+(n,\mathbb{R})$, $\mathrm{SL}(n)$ and $\mathrm{SO}(n)$ denote the group of complex unitary matrices,
orthogonal real matrices, invertible complex matrices, invertible real matrices with positive
determinant, the special linear group and the special orthogonal group, respectively. The
set $\mathfrak{so}(n)$ is the Lie-algebra of all $n\times n$ skew-symmetric matrices and $\mathfrak{sl}(n)$ denotes the
Lie-algebra of all $n\times n$ traceless matrices. The set of all $n\times n$ Hermitian matrices is $\mathbb{H}(n)$
and positive definite Hermitian matrices are denoted by $\mathbb{P}(n)$. We let $\operatorname{sym}_* X = \tfrac12(X^* + X)$
denote the Hermitian part of $X$ and $\operatorname{skew}_* X = \tfrac12(X - X^*)$ the skew-Hermitian part of $X$
such that $X = \operatorname{sym}_* X + \operatorname{skew}_* X$. In general, $\operatorname{Log} Z$ with capital letter denotes any solution
to $\exp X = Z$, while $\log Z$ denotes the principal logarithm.

1.3 Application and practical motivation for the matrix logarithm



In this subsection we describe how our minimization problem including the matrix
logarithm arises from new concepts in nonlinear elasticity theory and may find applications in
generalized Procrustes problems. Readers interested only in the result may continue reading
Section 2.



1.3.1 Strain measures in linear and nonlinear elasticity 

Define the Euclidean distance $\operatorname{dist}^2_{\rm euclid}(X, Y) := \|X - Y\|_F^2$, which is the squared length
of the line segment joining $X$ and $Y$ in $\mathbb{R}^{n^2}$. We consider an elastic body which in a reference
configuration occupies the bounded domain $\Omega \subset \mathbb{R}^3$. Deformations of the body are prescribed
by mappings

\[ \varphi : \Omega \to \mathbb{R}^3 , \tag{1.11} \]

where $\varphi(x)$ denotes the deformed position of the material point $x \in \Omega$. Central to elasticity
theory is the notion of strain. Strain is a measure of deformation such that no strain means
that the body $\Omega$ has been moved rigidly in space. In linearized elasticity, one considers
$\varphi(x) = x + u(x)$, where $u : \Omega \subset \mathbb{R}^3 \to \mathbb{R}^3$ is the displacement. The classical linearized strain
measure is $\varepsilon := \operatorname{sym}_*\nabla u$. It appears through a matrix nearness problem

\[ \operatorname{dist}^2_{\rm euclid}(\nabla u, \mathfrak{so}(3)) := \min_{W\in\mathfrak{so}(3)} \|\nabla u - W\|_F^2 = \|\operatorname{sym}_*\nabla u\|_F^2 . \tag{1.12} \]

Indeed, $\operatorname{sym}_*\nabla u$ qualifies as a linearized strain measure: if $\operatorname{dist}^2_{\rm euclid}(\nabla u, \mathfrak{so}(3)) = 0$ then
$u(x) = W.x + b$ is a linearized rigid movement. This is the case since

\[ \operatorname{dist}^2_{\rm euclid}(\nabla u(x), \mathfrak{so}(3)) = 0 \;\Rightarrow\; \nabla u(x) = W(x) \in \mathfrak{so}(3) \tag{1.13} \]

and $0 = \operatorname{Curl}\nabla u(x) = \operatorname{Curl} W(x)$ implies that $W(x)$ is constant.
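The nearness problem (1.12) has the skew-symmetric part of $\nabla u$ as its minimizer, because symmetric and skew-symmetric matrices are orthogonal in the Frobenius inner product. A minimal numerical sketch, in which a random matrix stands in for the displacement gradient (an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
grad_u = rng.standard_normal((3, 3))       # stands in for the displacement gradient
sym_part = 0.5 * (grad_u + grad_u.T)
skew_part = 0.5 * (grad_u - grad_u.T)      # candidate minimizer W in so(3)

def dist2(W):
    """Squared Frobenius distance between grad_u and W."""
    return np.linalg.norm(grad_u - W, "fro") ** 2

# Compare with randomly drawn skew-symmetric competitors
trials = []
for _ in range(500):
    B = rng.standard_normal((3, 3))
    trials.append(dist2(0.5 * (B - B.T)))

print(dist2(skew_part), np.linalg.norm(sym_part, "fro") ** 2, min(trials))
```

The first two printed numbers agree, and none of the random competitors does better.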

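In nonlinear elasticity theory one assumes that $\nabla\varphi \in GL^+(3,\mathbb{R})$ (no self-interpenetration
of matter) and considers the matrix nearness problem

\[ \operatorname{dist}^2_{\rm euclid}(\nabla\varphi, \mathrm{SO}(3)) := \min_{Q\in \mathrm{SO}(3)} \|\nabla\varphi - Q\|_F^2 = \min_{Q\in \mathrm{SO}(3)} \|Q^T\nabla\varphi - I\|_F^2 . \tag{1.14} \]

From (1.1) it immediately follows that

\[ \operatorname{dist}^2_{\rm euclid}(\nabla\varphi, \mathrm{SO}(3)) = \|\sqrt{\nabla\varphi^T\nabla\varphi} - I\|_F^2 . \tag{1.15} \]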

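The term $\sqrt{\nabla\varphi^T\nabla\varphi}$ is called the right stretch tensor and $\sqrt{\nabla\varphi^T\nabla\varphi} - I$ is called the Biot
strain tensor. Indeed, the quantity $\sqrt{\nabla\varphi^T\nabla\varphi} - I$ qualifies as a nonlinear strain measure: if
$\operatorname{dist}^2_{\rm euclid}(\nabla\varphi, \mathrm{SO}(3)) = 0$ then $\varphi(x) = Q.x + b$ is a rigid movement. This is the case since

\[ \operatorname{dist}^2_{\rm euclid}(\nabla\varphi, \mathrm{SO}(3)) = 0 \;\Rightarrow\; \nabla\varphi(x) = Q(x) \in \mathrm{SO}(3) \tag{1.16} \]

and $0 = \operatorname{Curl}\nabla\varphi(x) = \operatorname{Curl} Q(x)$ implies that $Q(x)$ is constant, see [37]. Many other
expressions can serve as strain measures. One classical example is the Hill-family [TSJ dHJ EH]
of strain measures

\[ a_m(\nabla\varphi) := \begin{cases} \dfrac{1}{m}\left( \left(\sqrt{\nabla\varphi^T\nabla\varphi}\right)^m - I \right), & m \neq 0, \\[1ex] \log\sqrt{\nabla\varphi^T\nabla\varphi}, & m = 0 . \end{cases} \tag{1.17} \]

The case $m = 0$ is known as Hencky's strain measure [16]. Note that $a_m(I + \nabla u) =
\operatorname{sym}_*\nabla u + \ldots$ all coincide in the first-order approximation for all $m \in \mathbb{R}$.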

In case of isotropic elasticity the formulation of a boundary value problem of place may
be based on postulating an elastic energy by integrating an $\mathrm{SO}(3)$-bi-invariant (isotropic and
frame-indifferent) function $W : \mathbb{R}^{3\times 3} \to \mathbb{R}$ of the strain measure $a_m$ over $\Omega$

\[ \mathcal{E}(\varphi) := \int_\Omega W(a_m(\nabla\varphi))\, dx , \qquad \varphi(x)|_{\Gamma_D} = \varphi_0(x) \tag{1.18} \]

and prescribing the boundary deformation $\varphi_0$ on the Dirichlet part $\Gamma_D \subset \partial\Omega$. The goal
is to minimize $\mathcal{E}(\varphi)$ in a class of admissible functions. For example, choosing $m = 1$ and
$W(a_m) = \mu\, \|a_m\|_F^2 + \tfrac{\lambda}{2} (\operatorname{tr}(a_m))^2$ leads to the isotropic Biot strain energy [36]

\[ \int_\Omega \mu\, \|\sqrt{\nabla\varphi^T\nabla\varphi} - I\|_F^2 + \frac{\lambda}{2} \left(\operatorname{tr}\left(\sqrt{\nabla\varphi^T\nabla\varphi} - I\right)\right)^2 dx \tag{1.19} \]

with Lamé constants $\mu, \lambda$. The corresponding Euler-Lagrange equations constitute a
nonlinear, second order system of partial differential equations. For reasonable physical response
of an elastic material Hill [18, 19, 39] has argued that $W$ should be a convex function of the
logarithmic strain measure $a_0(\nabla\varphi) = \log\sqrt{\nabla\varphi^T\nabla\varphi}$. This is the content of Hill's inequality.
Direct calculation shows that $a_0$ is the only strain measure among the family (1.17) that has
the tension-compression symmetry, i.e., for all unitarily invariant norms

\[ \|a_0(\nabla\varphi(x)^{-1})\| = \|a_0(\nabla\varphi(x))\| . \tag{1.20} \]

In his Ph.D thesis [21] the first author was the first to observe that energies convex in the
logarithmic strain measure $a_0(\nabla\varphi)$ are not rank-one convex. However, rank-one convexity
is true in a large neighborhood of the identity [10].
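The strain measures discussed above can be evaluated through the spectral decomposition of $\nabla\varphi^T\nabla\varphi$. The sketch below assumes the common form $a_m = \tfrac1m(U^m - I)$ for $m \neq 0$ and $a_0 = \log U$ with $U = \sqrt{\nabla\varphi^T\nabla\varphi}$, which is consistent with the Biot case $m = 1$ and the Hencky case $m = 0$ named in the text; the function name `hill_strain` and the test matrices are illustrative.

```python
import numpy as np

def hill_strain(F, m):
    """Strain measure a_m(F) built from the right stretch tensor sqrt(F^T F)."""
    lam2, V = np.linalg.eigh(F.T @ F)          # F^T F = V diag(lam2) V^T
    lam = np.sqrt(lam2)                        # principal stretches
    vals = np.log(lam) if m == 0 else (lam ** m - 1.0) / m
    return V @ np.diag(vals) @ V.T

rng = np.random.default_rng(4)
G = rng.standard_normal((3, 3))

# First-order agreement: a_m(I + t G) = t sym(G) + O(t^2) for every m
t = 1e-5
errors = {m: np.linalg.norm(hill_strain(np.eye(3) + t * G, m) - t * 0.5 * (G + G.T))
          for m in (-1, 0, 1, 2)}

# Tension-compression symmetry (1.20): compare norms at F and F^{-1}
F = np.eye(3) + 0.2 * G
F_inv = np.linalg.inv(F)
norm_a0 = np.linalg.norm(hill_strain(F, 0))
norm_a0_inv = np.linalg.norm(hill_strain(F_inv, 0))
norm_a1 = np.linalg.norm(hill_strain(F, 1))
norm_a1_inv = np.linalg.norm(hill_strain(F_inv, 1))

print(errors)
print(norm_a0, norm_a0_inv)   # equal for the logarithmic measure
print(norm_a1, norm_a1_inv)   # in general different for the Biot measure
```

The symmetry for $m = 0$ follows because the singular values of $F^{-1}$ are the reciprocals of those of $F$, so their logarithms only change sign.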

Assume for simplicity that we deal with an elastic material that can only sustain volume
preserving deformations. Locally, we must have $\det\nabla\varphi(x) = 1$. Thus, for the deformation
gradient $\nabla\varphi(x) \in \mathrm{SL}(3)$. On $\mathrm{SL}(3)$ the straight line $X + t(Y - X)$ joining $X, Y \in \mathrm{SL}(3)$
leaves the group. Thus, the Euclidean distance $\operatorname{dist}^2_{\rm euclid}(\nabla\varphi, \mathrm{SO}(3))$ does not respect the
group structure of $\mathrm{SL}(3)$.

Since the Euclidean distance (1.15) is an arbitrary choice, novel approaches in nonlinear
elasticity theory aim at putting more geometry (i.e. respecting the group structure of the
deformation mappings) into the description of the strain a material endures. In this context,
it is natural to consider the strain measures induced by the geodesic distances stemming from
choices for the Riemannian structure respecting also the algebraic group structure, which we
introduce next.

1.3.2 Geodesic distances

In a connected Riemannian manifold $\mathcal{M}$ with Riemannian metric $g$, the length of a
continuously differentiable curve $\gamma : [a, b] \to \mathcal{M}$ is defined by

\[ L(\gamma) := \int_a^b \sqrt{g_{\gamma(s)}(\dot\gamma(s), \dot\gamma(s))}\; ds . \tag{1.21} \]

At every $X \in \mathcal{M}$ the metric $g_X : T_X\mathcal{M} \times T_X\mathcal{M} \to \mathbb{R}$ is a positive definite, symmetric bilinear
form on the tangent space $T_X\mathcal{M}$. The distance $\operatorname{dist}_{{\rm geod},\mathcal{M}}(X, Y)$ between two points $X$ and $Y$
of $\mathcal{M}$ is defined as the infimum of the length taken over all continuous, piecewise continuously
differentiable curves $\gamma : [a, b] \to \mathcal{M}$ such that $\gamma(a) = X$ and $\gamma(b) = Y$. With this definition
of distance, geodesics in a Riemannian manifold are the locally distance-minimizing paths,
in the above sense. Regarding $\mathcal{M} = \mathrm{SL}(3)$ as a Riemannian manifold equipped with the
metric associated to one of the positive definite quadratic forms of the family

\[ \forall\, \mu, \mu_c > 0: \quad g_X(\xi, \xi) := \mu\, \|\operatorname{sym}_*(X^{-1}\xi)\|_F^2 + \mu_c\, \|\operatorname{skew}_*(X^{-1}\xi)\|_F^2 , \qquad \xi \in T_X\mathrm{SL}(3) , \tag{1.22} \]

we have $\gamma^{-1}(t)\dot\gamma(t) \in T_I\mathrm{SL}(3) = \mathfrak{sl}(3)$ (by direct calculation, $\mathfrak{sl}(3)$ denotes the trace free
$\mathbb{R}^{3\times 3}$-matrices) and

\[ g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t)) = \mu\, \|\operatorname{sym}_*(\gamma^{-1}(t)\dot\gamma(t))\|_F^2 + \mu_c\, \|\operatorname{skew}_*(\gamma^{-1}(t)\dot\gamma(t))\|_F^2 . \tag{1.23} \]

It is clear that

\[ \forall\, \mu, \mu_c > 0: \quad \mu\, \|\operatorname{sym}_* Y\|_F^2 + \mu_c\, \|\operatorname{skew}_* Y\|_F^2 \tag{1.24} \]

is the square of a norm on $\mathfrak{sl}(3)$. For such a choice of metric we then obtain the associated
Riemannian distance metric

\[ \operatorname{dist}_{{\rm geod},\mathrm{SL}(3)}(X, Y) = \inf\{ L(\gamma) ,\ \gamma(a) = X ,\ \gamma(b) = Y \} . \tag{1.25} \]

This construction ensures the validity of the triangle inequality [22, p. 14]. The geodesics on
$\mathrm{SL}(3)$ for the family of metrics (1.22) have been computed in [25] in the context of dissipation
distances in elasto-plasticity.

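With this preparation, it is now natural to consider the strain measure induced by the
geodesic distance. For a given deformation gradient $\nabla\varphi \in \mathrm{SL}(3)$ we thus compute the
distance to the nearest orthogonal matrix in the geodesic distance (1.25) on the Riemannian
manifold and matrix Lie-group $\mathrm{SL}(3)$, i.e.,

\[ \operatorname{dist}^2_{{\rm geod},\mathrm{SL}(3)}(\nabla\varphi, \mathrm{SO}(3)) := \min_{Q\in \mathrm{SO}(3)} \operatorname{dist}^2_{{\rm geod},\mathrm{SL}(3)}(\nabla\varphi, Q) . \tag{1.26} \]

It is clear that this defines a strain measure, since $\operatorname{dist}_{{\rm geod},\mathrm{SL}(3)}(\nabla\varphi(x), \mathrm{SO}(3)) = 0$ implies
$\nabla\varphi(x) \in \mathrm{SO}(3)$, whence $\varphi(x) = Qx + b$. Fortunately, the minimization on the right hand
side in (1.26) can be carried out although the explicit distances $\operatorname{dist}_{{\rm geod},\mathrm{SL}(3)}(\nabla\varphi, Q)$ for
$Q \in \mathrm{SO}(3)$ remain unknown to us. It has been shown that

\[ \operatorname{dist}^2_{{\rm geod},\mathrm{SL}(3)}(\nabla\varphi, \mathrm{SO}(3)) = \min_{Q\in \mathrm{SO}(3)} \operatorname{dist}^2_{{\rm geod},\mathrm{SL}(3)}(\nabla\varphi, Q)
   = \min_{Q\in \mathrm{SO}(3)} \mu\, \|\operatorname{sym}_*\operatorname{Log}(Q^T\nabla\varphi)\|_F^2 + \mu_c\, \|\operatorname{skew}_*\operatorname{Log}(Q^T\nabla\varphi)\|_F^2 . \tag{1.27} \]

Here and throughout, by $\operatorname{Log} Z$ we denote any matrix logarithm, one of the many solutions
$X$ to $\exp X = Z$. By contrast, $\log Z$ denotes the principal logarithm, see Section 3.3. The
last equality constitutes the basic motivation for this work, where we solve the minimization
problem on the right hand side of (1.27) and determine thus the precise form of the geodesic
strain measure. As a result of this paper it turns out that

\[ \operatorname{dist}^2_{{\rm geod},\mathrm{SL}(3)}(\nabla\varphi, \mathrm{SO}(3)) = \mu\, \|\log\sqrt{\nabla\varphi^T\nabla\varphi}\|_F^2 , \tag{1.28} \]

which is nothing else but a quadratic expression in Hencky's strain measure (1.17) and
therefore satisfying Hill's inequality.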

Geodesic distance measures have appeared recently in many other applications: for
example, one considers a geodesic distance on the Riemannian manifold of the cone of positive
definite matrices $\mathbb{P}(n)$ (which is a Lie-group but not w.r.t. the usual matrix multiplication)
[3 127] given by

\[ \operatorname{dist}^2_{{\rm geod},\mathbb{P}(n)}(P_1, P_2) := \|\log(P_1^{-1/2} P_2 P_1^{-1/2})\|_F^2 . \tag{1.29} \]

Another distance, the so-called log-Euclidean metric on $\mathbb{P}(n)$

\[ \operatorname{dist}^2_{{\rm log,euclid},\mathbb{P}(n)}(P_1, P_2) := \|\log P_2 - \log P_1\|_F^2
   \quad \left(\text{in general } \neq \|\log(P_1^{-1}P_2)\|_F^2 = \operatorname{dist}^2_{{\rm log,euclid},\mathbb{P}(n)}(P_1^{-1}P_2, I)\right) \tag{1.30} \]

is proposed in [1]. Both formulas find application in diffusion tensor imaging or in fitting of
positive definite elasticity tensors. The geodesic distance on the compact matrix Lie-group
$\mathrm{SO}(n)$ is also well known, and it has important applications in the interpolation and filtering
of experimental data given on $\mathrm{SO}(3)$:

\[ \operatorname{dist}^2_{{\rm geod},\mathrm{SO}(n)}(Q_1, Q_2) := \|\log(Q_1^{-1}Q_2)\|_F^2 , \qquad -1 \notin \operatorname{spec}(Q_1^{-1}Q_2) . \tag{1.31} \]
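The two distances (1.29) and (1.30) on positive definite matrices are easy to evaluate through spectral decompositions. The sketch below uses hypothetical helper names (`spd_fun`, `dist2_geod_P`, `dist2_log_euclid`) and random symmetric positive definite test matrices; it also evaluates both distances at a matrix and at its inverse, measured from the identity.

```python
import numpy as np

def spd_fun(P, f):
    """Apply a scalar function to a symmetric positive definite matrix."""
    lam, V = np.linalg.eigh(P)
    return V @ np.diag(f(lam)) @ V.T

def dist2_geod_P(P1, P2):
    """Squared geodesic distance (1.29)."""
    R = spd_fun(P1, lambda x: x ** -0.5)             # P1^{-1/2}
    return np.linalg.norm(spd_fun(R @ P2 @ R, np.log), "fro") ** 2

def dist2_log_euclid(P1, P2):
    """Squared log-Euclidean distance (1.30)."""
    return np.linalg.norm(spd_fun(P2, np.log) - spd_fun(P1, np.log), "fro") ** 2

rng = np.random.default_rng(5)

def random_spd(n):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

P1, P2 = random_spd(3), random_spd(3)
I3 = np.eye(3)
P1_inv = np.linalg.inv(P1)

print(dist2_geod_P(P1, P2), dist2_log_euclid(P1, P2))     # the two distances need not agree
print(dist2_geod_P(P1, I3), dist2_geod_P(P1_inv, I3))     # same distance from I for P1 and its inverse
print(dist2_log_euclid(P1, I3), dist2_log_euclid(P1_inv, I3))
```

Measured from the identity, both expressions reduce to $\|\log P_1\|_F^2$, which is unchanged when $P_1$ is replaced by its inverse.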

In cases (1.29), (1.30), (1.31) it is, contrary to (1.27), the principal matrix logarithm that
appears naturally. A common and desirable feature of all distance measures involving the
logarithm presented above, setting them apart from the Euclidean distance, is invariance
under inversion: $d(X, I) = d(X^{-1}, I)$ and $d(X, 0) = +\infty$. We note in passing that

\[ d^2_{{\rm log},GL^+(n,\mathbb{R})}(X, Y) := \|\operatorname{Log}(X^{-1}Y)\|_F^2 \tag{1.32} \]

does not satisfy the triangle inequality and thus it cannot be a Riemannian distance metric
on $GL^+(n,\mathbb{R})$. Further, $X^{-1}Y$ is in general not in the domain of definition of the principal
matrix logarithm. If applicable, the expression (1.32) measures in fact the length of curves
$\gamma : [0, 1] \to \mathcal{M}$, $\gamma(0) = X$, $\gamma(1) = Y$ defining one-parameter groups $\gamma(s) = X \exp(s \operatorname{Log}(X^{-1}Y))$
on the matrix Lie-group $\mathcal{M}$. Note that it is only if the manifold $\mathcal{M}$ is a compact matrix
Lie-group (like e.g. $\mathrm{SO}(n)$) equipped with a bi-invariant Riemannian metric that the geodesics
are precisely one-parameter subgroups jlQl Prop. 9]. This point is sometimes overlooked in
the literature.
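The statement that (1.32) measures the length of the one-parameter curve $\gamma(s) = X\exp(s\operatorname{Log}(X^{-1}Y))$ can be checked by numerical quadrature of (1.21) with the metric (1.22). In the sketch below the matrix functions are evaluated through eigendecompositions (helper `funm_eig`, assuming diagonalizable arguments), the velocity is approximated by central differences, and the weights and test matrices are arbitrary illustrative choices.

```python
import numpy as np

def funm_eig(A, f):
    """Evaluate a matrix function through the eigendecomposition (A assumed diagonalizable)."""
    lam, V = np.linalg.eig(A)
    return V @ np.diag(f(lam.astype(complex))) @ np.linalg.inv(V)

def sym(X):
    return 0.5 * (X + X.T)

def skew(X):
    return 0.5 * (X - X.T)

def speed(X, xi, mu, mu_c):
    """sqrt(g_X(xi, xi)) for the left-invariant metric (1.22)."""
    Y = np.linalg.solve(X, xi)                        # X^{-1} xi
    return np.sqrt(mu * np.linalg.norm(sym(Y)) ** 2 + mu_c * np.linalg.norm(skew(Y)) ** 2)

rng = np.random.default_rng(6)
A = rng.standard_normal((3, 3))
X = A / np.cbrt(np.linalg.det(A))                     # det X = 1
B = rng.standard_normal((3, 3))
W = 0.5 * (B - np.trace(B) / 3.0 * np.eye(3))         # trace-free generator
Y = X @ funm_eig(W, np.exp).real                      # second point in SL(3)

W_log = funm_eig(np.linalg.solve(X, Y), np.log).real  # a logarithm of X^{-1} Y

def curve(s):
    return X @ funm_eig(s * W_log, np.exp).real

mu, mu_c, N, h = 1.0, 0.25, 50, 1e-5
midpoints = (np.arange(N) + 0.5) / N                  # midpoint rule on [0, 1]
length = sum(speed(curve(s), (curve(s + h) - curve(s - h)) / (2 * h), mu, mu_c)
             for s in midpoints) / N
expected = np.sqrt(mu * np.linalg.norm(sym(W_log)) ** 2 + mu_c * np.linalg.norm(skew(W_log)) ** 2)

print(length, expected)
```

Along this curve $\gamma^{-1}\dot\gamma$ is constant, so the integrand in (1.21) does not depend on $s$; whether the curve is also a geodesic is a separate question, as the text points out.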

1.3.3 A geodesic orthogonal Procrustes problem on SL(3).

The Euclidean orthogonal Procrustes problem for $Z, B \in \mathrm{SL}(3)$

\[ \min_{Q\in \mathrm{U}(3)} \operatorname{dist}^2_{\rm euclid}(Z, BQ) = \min_{Q\in \mathrm{U}(3)} \|Z - BQ\|_F^2 \tag{1.33} \]

has as solution the unitary polar factor of $B^*Z$ |TH Ch. 12]. However, any linear
transformation of $Z$ and $B$ will yield another optimal unitary matrix. This deficiency can be
circumvented by considering the straightforward extension to the geodesic case

\[ \min_{Q} \operatorname{dist}^2_{{\rm geod},\mathrm{SL}(3)}(Z, BQ) . \tag{1.34} \]

In contrast to the Euclidean distance, the geodesic distance is by construction
$\mathrm{SL}(3)$-left-invariant:

\[ \forall\, B \in \mathrm{SL}(3): \quad \operatorname{dist}^2_{{\rm geod},\mathrm{SL}(3)}(X, Y) = \operatorname{dist}^2_{{\rm geod},\mathrm{SL}(3)}(BX, BY) \tag{1.35} \]

and therefore we have

\[ \min_{Q} \operatorname{dist}^2_{{\rm geod},\mathrm{SL}(3)}(Z, BQ) = \min_{Q} \operatorname{dist}^2_{{\rm geod},\mathrm{SL}(3)}(B^{-1}Z, Q) \tag{1.36} \]

with "another" geodesic optimal solution: the unitary polar factor of $B^{-1}Z$, according to
the results of this paper.
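The difference between the two Procrustes solutions can be made concrete for real matrices. In the sketch below, `polar_factor` and `random_sl3` are hypothetical helpers; the pair $(Z, B)$ is replaced by $(AZ, AB)$ with a further matrix $A \in \mathrm{SL}(3)$ to show how each candidate solution reacts to a common left multiplication.

```python
import numpy as np

def polar_factor(A):
    """Orthogonal polar factor of a real square matrix via the SVD."""
    W, _, Vt = np.linalg.svd(A)
    return W @ Vt

def random_sl3(rng):
    A = rng.standard_normal((3, 3))
    return A / np.cbrt(np.linalg.det(A))             # rescale to determinant one

rng = np.random.default_rng(7)
Z, B, A = random_sl3(rng), random_sl3(rng), random_sl3(rng)

Q_euclid = polar_factor(B.T @ Z)                     # minimizer of ||Z - B Q||_F over O(3)
Q_geod = polar_factor(np.linalg.solve(B, Z))         # polar factor of B^{-1} Z

# Replace (Z, B) by (A Z, A B)
Q_euclid_A = polar_factor((A @ B).T @ (A @ Z))
Q_geod_A = polar_factor(np.linalg.solve(A @ B, A @ Z))

# Sampled orthogonal competitors for the Euclidean problem
competitors = [np.linalg.qr(rng.standard_normal((3, 3)))[0] for _ in range(500)]
euclid_best = np.linalg.norm(Z - B @ Q_euclid, "fro")
euclid_sampled = min(np.linalg.norm(Z - B @ Q, "fro") for Q in competitors)

print(np.linalg.norm(Q_euclid_A - Q_euclid))   # the Euclidean solution changes
print(np.linalg.norm(Q_geod_A - Q_geod))       # the polar factor of B^{-1} Z does not
```

The second difference vanishes up to rounding because $(AB)^{-1}AZ = B^{-1}Z$, which mirrors the left-invariance (1.35).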



2 Prelude on optimal rotations in the complex plane

Let us turn to the optimal rotation problem (1.9)$_1$:

\[ \min_{Q\in \mathrm{U}(n)} \|\operatorname{Log}(Q^*Z)\|^2 . \tag{2.1} \]

In order to get hands on this problem we first consider the scalar case. We may always
identify the punctured complex plane $\mathbb{C}\setminus\{0\} =: \mathbb{C}^\times = GL(1,\mathbb{C})$ with the two-dimensional
conformal group $\mathrm{CSO}(2) \subset GL^+(2,\mathbb{R})$ through the mapping

\[ z = a + ib \;\mapsto\; Z \in \mathrm{CSO}(2) := \left\{ \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \;\middle|\; a, b \in \mathbb{R},\ a^2 + b^2 \neq 0 \right\} . \tag{2.2} \]

Let us define a norm $\|\cdot\|_{\rm CSO}$ on $\mathrm{CSO}(2)$. We set $\|X\|^2_{\rm CSO} := \tfrac12\|X\|_F^2 = \tfrac12\operatorname{tr}(X^TX)$. Some
useful connections between $\mathbb{C}^\times$ and $\mathrm{CSO}(2)$ are given in the appendix.
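The identification (2.2) is compatible with multiplication, and the norm $\|\cdot\|_{\rm CSO}$ reproduces the modulus. A minimal sketch, with hypothetical helper names `to_cso2` and `norm_cso_sq` and two arbitrary sample numbers:

```python
import numpy as np

def to_cso2(z):
    """Map z = a + ib to the matrix [[a, b], [-b, a]] of (2.2)."""
    return np.array([[z.real, z.imag], [-z.imag, z.real]])

def norm_cso_sq(X):
    """Squared CSO(2) norm: half the squared Frobenius norm."""
    return 0.5 * np.trace(X.T @ X)

z, w = 1.5 - 0.5j, -0.3 + 2.0j
Zm, Wm = to_cso2(z), to_cso2(w)

# Orthogonal polar factor of Zm via the SVD; it represents the unit number z / |z|
U, _, Vt = np.linalg.svd(Zm)
U_p = U @ Vt

print(np.allclose(Zm @ Wm, to_cso2(z * w)))        # products correspond
print(norm_cso_sq(Zm), abs(z) ** 2)                # the norm is the squared modulus
print(np.allclose(U_p, to_cso2(z / abs(z))))       # polar factor <-> e^{i arg z}
```

In this representation the polar factor of $Z$ corresponds to the phase $e^{i\arg(z)}$ and the Hermitian factor to $|z|\,I$.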

Next we introduce the logarithm. For every invertible $z \in \mathbb{C}\setminus\{0\} =: \mathbb{C}^\times$ there always
exists a solution to $e^\eta = z$ and we call $\eta \in \mathbb{C}$ the natural complex logarithm $\operatorname{Log}_{\mathbb{C}}(z)$
of $z$. However, this logarithm may not be unique, depending on the unwinding number
[17, p. 269]. The definition of the natural logarithm has some well known deficiencies: the
formula $\operatorname{Log}_{\mathbb{C}}(w^z) = z \operatorname{Log}_{\mathbb{C}}(w)$ does not hold, since, e.g.
$i\pi = \operatorname{Log}_{\mathbb{C}}(-1) = \operatorname{Log}_{\mathbb{C}}((-i)^2) \neq 2\operatorname{Log}_{\mathbb{C}}(-i) = 2\left(-\tfrac{i\pi}{2}\right) = -i\pi$.
Therefore the principal complex logarithm jH p. 79]

\[ \log : \mathbb{C}^\times \to \{ z \in \mathbb{C} \mid -\pi < \Im z \le \pi \} \tag{2.3} \]

is defined as the unique solution $\eta \in \mathbb{C}$ of the equation

\[ e^\eta = z \;\Leftrightarrow\; \eta = \log(z) := \log|z| + i\arg(z) , \tag{2.4} \]

such that the argument $\arg(z) \in (-\pi, \pi]$.$^1$ The principal complex logarithm is continuous
(indeed holomorphic) only on the smaller set $\mathbb{C}\setminus(-\infty, 0]$. Let us define the set
$\mathcal{D} := \{z \in \mathbb{C} \mid |z - 1| < 1\}$. In order to avoid unnecessary complications at this point, we
introduce a further open set, the "near identity subset" $\mathcal{D}^\sharp \subset \mathcal{D}$, containing 1 and with the
property that $z_1, z_2 \in \mathcal{D}^\sharp$ implies $z_1z_2 \in \mathcal{D}$ and $z_1^{-1} \in \mathcal{D}$. On $\mathcal{D}^\sharp \subset \mathbb{C}^\times$ all the usual rules
for the logarithm apply. On $\mathbb{R}^+\setminus\{0\}$ all the logarithmic distance measures encountered in the
introduction coincide with the logarithmic metric [HJ p. 109] (the "hyperbolic distance" p. 735])

\[ \operatorname{dist}^2_{\log,\mathbb{R}^+}(x, y) := |\log(x^{-1}y)|^2 = |\log y - \log x|^2 , \qquad \operatorname{dist}^2_{\log,\mathbb{R}^+}(x, 1) := |\log|x||^2 . \tag{2.5} \]
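The failure of the power rule for the complex logarithm and the scalar distance (2.5) can be reproduced with the standard library. The sketch uses Python's `cmath.log`, which returns the principal value with imaginary part in $[-\pi, \pi]$; the function name `dist2_log` is an illustrative choice.

```python
import cmath
import math

# Power rule fails for the principal logarithm: log((-i)^2) != 2 log(-i)
log_of_square = cmath.log(complex(-1.0, 0.0))      # (-i)^2 = -1, principal value i*pi
twice_log = 2 * cmath.log(complex(0.0, -1.0))      # 2 * (-i*pi/2) = -i*pi

def dist2_log(x, y):
    """Squared logarithmic distance (2.5) between positive reals x and y."""
    return math.log(y / x) ** 2

print(log_of_square, twice_log)
print(dist2_log(2.0, 8.0), (math.log(8.0) - math.log(2.0)) ** 2)
print(dist2_log(5.0, 1.0), dist2_log(0.2, 1.0))    # same distance from 1 for x and 1/x
```

The two values in the first line differ by $2\pi i$, which is exactly the ambiguity between branches of the logarithm.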
This metric can still be extended to a metric on $\mathcal{D}^\sharp$ through

\[ \begin{aligned}
 \operatorname{dist}^2_{\log,\mathcal{D}^\sharp}(z_1, z_2) &:= |\log(z_1^{-1}z_2)|^2 , & z_1, z_2 &\in \mathcal{D}^\sharp , \\
 \operatorname{dist}^2_{\log,\mathcal{D}^\sharp}(r_1e^{i\vartheta_1}, r_2e^{i\vartheta_2}) &= |\log(r_1^{-1}r_2)|^2 + |\vartheta_1 - \vartheta_2|^2 , & r_1e^{i\vartheta_1}, r_2e^{i\vartheta_2} &\in \mathcal{D}^\sharp .
\end{aligned} \tag{2.6} \]

Further, for $z \in \mathcal{D}^\sharp$ we formally recover a version of (2.5)$_2$:
$\min_{e^{i\vartheta}\in\mathcal{D}^\sharp} \operatorname{dist}^2_{\log,\mathcal{D}^\sharp}(e^{i\vartheta}, z) = |\log|z||^2$. We remark, however, that
$\operatorname{dist}_{\log,\mathcal{D}^\sharp}$ does not define a metric on $\mathbb{C}^\times$ due to the periodicity of the complex exponential.
Let us also define a log-Euclidean distance metric on $\mathbb{C}^\times$, continuous only on $\mathbb{C}\setminus(-\infty, 0]$, in
analogy with (1.30)

\[ \operatorname{dist}^2_{{\rm log,euclid},\mathbb{C}^\times}(z_1, z_2) := |\log z_2 - \log z_1|^2 = \left|\log\frac{|z_2|}{|z_1|}\right|^2 + |\arg(z_2) - \arg(z_1)|^2 . \tag{2.7} \]

$^1$ For example $\log(-1) = i\pi$ since $e^{i\pi} = -1$. Otherwise, the complex logarithm always exists but may
not be unique, e.g. $e^{-i\pi} = -1$. Hence $\operatorname{Log}_{\mathbb{C}}(-1) = \{i\pi, -i\pi, \ldots\}$. For scalars, our definition of the
principal complex logarithm can be applied to negative real arguments. However, in the matrix setting the
principal matrix logarithm is defined only on invertible matrices which do not have negative real eigenvalues.

The identity

\[ \operatorname{dist}^2_{\log,\mathcal{D}^\sharp}(z_1, z_2) = \operatorname{dist}^2_{{\rm log,euclid},\mathbb{C}^\times}(z_1, z_2) \tag{2.8} \]

on $\mathcal{D}^\sharp$ is obvious but fails on $\mathbb{C}^\times$. With this preparation, we now approach our minimization
problem in terms of $\mathrm{CSO}(2)$ versus $\mathbb{C}^\times$. For given $Z \in \mathrm{CSO}(2)$ we find that

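\[ \min_{Q\in \mathrm{SO}(2)} \|\operatorname{Log}(Q^TZ)\|^2_{\rm CSO} \;\Leftrightarrow\; \min_{\vartheta\in(-\pi,\pi]} |\operatorname{Log}_{\mathbb{C}}(e^{-i\vartheta}z)|^2 \;\Leftrightarrow\; \min_{\vartheta\in(-\pi,\pi]} \left(\operatorname{dist}_{\log,\mathbb{C}^\times}(e^{i\vartheta}, z)\right)^2 . \tag{2.9} \]

It is important to avoid the additive representation $\operatorname{dist}^2_{{\rm log,euclid},\mathbb{C}^\times}(z_1, z_2)$, because in the
general matrix setting $Q$ and $Z$ will in general not commute and the equivalence (2.9) is
then lost. If, however, $Q \in \mathrm{SO}(n)$ and $Z \in \mathbb{P}(n)$ do commute, the optimality result is a
trivial consequence of the Campbell-Baker-Hausdorff formula [17, p. 270, Thm. 11.3], since
in that case $\min_{Q\in \mathrm{SO}(n)} \|\operatorname{Log} Q^TZ\|_F^2 = \min_{Q\in \mathrm{SO}(n)} \|\log Z - \log Q\|_F^2 = \|\operatorname{sym}_*\log Z\|_F^2 =
\|\log\sqrt{Z^TZ}\|_F^2$.

The restrictions implied by working on $\mathcal{D}^\sharp$ are unduly hard, so we have also extended the
distance function $\operatorname{dist}_{\log,\mathcal{D}^\sharp}$ defined on $\mathcal{D}^\sharp$ to a function $\operatorname{dist}_{\log,\mathbb{C}^\times}$ defined on $\mathbb{C}^\times$ by sacrificing
the metric properties and by using any complex logarithm $\operatorname{Log}_{\mathbb{C}}$.$^2$ In order to give this
minimization problem a precise sense, we define

\[ \begin{aligned}
 \min_{\vartheta\in(-\pi,\pi]} |\operatorname{Log}_{\mathbb{C}}(e^{-i\vartheta}z)|^2
 &:= \min_{\vartheta\in(-\pi,\pi]} \{ |w|^2 \mid e^w = e^{-i\vartheta}z \}
  = \min_{\vartheta\in(-\pi,\pi]} \{ |w|^2 \mid e^w = e^{-i\vartheta}e^{i\arg(z)}|z| \} \\
 &= \min_{\vartheta\in(-\pi,\pi]} \{ |w|^2 \mid e^w = e^{-i\vartheta}|z| \}
  = \min_{\vartheta\in(-\pi,\pi]} |\operatorname{Log}_{\mathbb{C}}(e^{-i\vartheta}|z|)|^2 .
\end{aligned} \tag{2.10} \]

The solution of this minimization problem is again$^3$ $|\log|z||^2$. However, our goal is to
introduce an argument that can be generalized to the non-commutative matrix setting.
From $|z| \ge |\Re(z)|$ it follows that

\[ \min_{\vartheta\in(-\pi,\pi]} |\operatorname{Log}_{\mathbb{C}}(e^{-i\vartheta}z)|^2 \ge \min_{\vartheta\in(-\pi,\pi]} |\Re(\operatorname{Log}_{\mathbb{C}}(e^{-i\vartheta}z))|^2 = |\log|z||^2 , \tag{2.11} \]

where we used the result (2.15) below for the last equality. The minimum for $\vartheta \in (-\pi, \pi]$
is achieved if and only if $\vartheta = \arg(z)$ since $\arg(z) \in (-\pi, \pi]$ and we are looking only for
$\vartheta \in (-\pi, \pi]$. Thus

\[ \min_{\vartheta\in(-\pi,\pi]} |\operatorname{Log}_{\mathbb{C}}(e^{-i\vartheta}z)|^2 = |\log|z||^2 . \tag{2.12} \]

The unique optimal rotation $Q(\vartheta) \in \mathrm{SO}(2)$ is given by the polar factor $U_p$ through $\vartheta =
\arg(z)$ and the minimum is $|\log|z||^2$, which corresponds to $\min_{Q\in \mathrm{SO}(2)} \|\operatorname{Log}(Q^TZ)\|^2_{\rm CSO} =
\|\log\sqrt{Z^TZ}\|^2_{\rm CSO}$.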
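The scalar minimization can be explored by brute force. The sketch below, which assumes a sample value of $z$, a grid of 2000 angles in $(-\pi, \pi]$ and a finite window of branches $k = -3, \ldots, 3$ (all illustrative choices), evaluates the smallest $|w|^2$ and the smallest $|\Re w|^2$ over the logarithms $w$ of $e^{-i\vartheta}z$.

```python
import cmath
import math

z = complex(-0.7, 1.9)
target = math.log(abs(z)) ** 2                      # |log|z||^2

def logs_of(u, branches=range(-3, 4)):
    """A finite selection of the logarithms of u: principal value plus 2*pi*i*k."""
    base = cmath.log(u)
    return [base + 2j * math.pi * k for k in branches]

def min_abs_sq(theta):
    return min(abs(w) ** 2 for w in logs_of(cmath.exp(-1j * theta) * z))

def min_re_sq(theta):
    return min(w.real ** 2 for w in logs_of(cmath.exp(-1j * theta) * z))

thetas = [-math.pi + 2 * math.pi * (j + 1) / 2000 for j in range(2000)]   # grid in (-pi, pi]
abs_values = [min_abs_sq(t) for t in thetas]
re_values = [min_re_sq(t) for t in thetas]
best_theta = thetas[abs_values.index(min(abs_values))]

print(min(abs_values), target)           # grid minimum is close to |log|z||^2
print(best_theta, cmath.phase(z))        # and is located near arg(z)
print(min(re_values), max(re_values))    # the real part does not depend on theta
```

The last line illustrates why the real-part problem leaves the rotation angle undetermined, while the full problem singles out $\vartheta = \arg(z)$.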

Next, consider the symmetric minimization problem (1.9)$_2$ for given $Z \in \mathrm{CSO}(2)$ and its
equivalent representation in $\mathbb{C}^\times$:

\[ \min_{Q\in \mathrm{SO}(2)} \|\operatorname{sym}_*\operatorname{Log}(Q^TZ)\|^2_{\rm CSO} \;\Leftrightarrow\; \min_{\vartheta\in(-\pi,\pi]} |\Re(\operatorname{Log}_{\mathbb{C}}(e^{-i\vartheta}z))|^2 . \tag{2.13} \]

$^2$ Note that writing $\min_{\vartheta\in(-\pi,\pi]} |\log(e^{-i\vartheta}z)|^2$ poses problems, since evaluating the principal complex
logarithm $\log e^{i(\arg(z)-\vartheta)}|z|$ would restrict $\vartheta$ such that $\arg(z) - \vartheta \in (-\pi, \pi]$.

$^3$ $\min_{\vartheta\in(-\pi,\pi]} |\operatorname{Log}_{\mathbb{C}}(e^{-i\vartheta}|z|)|^2 = \min_{\vartheta\in(-\pi,\pi]} |\log|z| + i(-\vartheta)|^2 = \min_{\vartheta\in(-\pi,\pi]} |\log|z||^2 + |\vartheta|^2 = |\log|z||^2$.

Note that the expression $\operatorname{dist}_{\log,\mathbb{C}^\times}(z_1, z_2) := |\operatorname{Log}_{\mathbb{C}}(z_1^{-1}z_2)|$ does not define a metric,
even when restricted to $\mathcal{D}^\sharp$. As before, we define

\[ \min_{\vartheta\in(-\pi,\pi]} |\Re(\operatorname{Log}_{\mathbb{C}}(e^{-i\vartheta}z))|^2 := \min_{\vartheta\in(-\pi,\pi]} \{ |\Re w|^2 \mid e^w = e^{-i\vartheta}z \} \tag{2.14} \]

and obtain

\[ \begin{aligned}
 \min_{\vartheta\in(-\pi,\pi]} |\Re(\operatorname{Log}_{\mathbb{C}}(e^{-i\vartheta}z))|^2
 &= \min_{\vartheta\in(-\pi,\pi]} |\Re(\operatorname{Log}_{\mathbb{C}}(e^{-i\vartheta}|z|e^{i\arg(z)}))|^2 \\
 &= \min_{\vartheta\in(-\pi,\pi]} |\Re(\operatorname{Log}_{\mathbb{C}}(|z|e^{i(\arg(z)-\vartheta)}))|^2 \\
 &= \min_{\vartheta\in(-\pi,\pi]} \{ |\Re(\log|z| + i(\arg(z) - \vartheta + 2\pi k))|^2 ,\ k \in \mathbb{Z} \} \\
 &= |\log|z||^2 .
\end{aligned} \tag{2.15} \]

Thus the minimum is again realized by the polar factor $U_p$, but note that the optimal
rotation is completely undetermined, since $\vartheta$ is not constrained in the problem. Despite the
logarithm $\operatorname{Log}_{\mathbb{C}}$ being multivalued, this formulation of the minimization problem circumvents
the problem of the branch points of the natural complex logarithm. This observation suggests
that considering the generalization of (2.15), i.e. $\min_{Q\in \mathrm{U}(n)} \|\operatorname{sym}_*\operatorname{Log}(Q^*Z)\|^2$, in the first place
is helpful also for the general matrix problem. This is indeed the case.

With this preparation we now turn to the general, non-commutative matrix setting. 



3 Preparation for the general complex matrix setting 
3.1 Multivalued formulation 

For every nonsingular $Z \in GL(n,\mathbb{C})$ there exists a solution $X \in \mathbb{C}^{n\times n}$ to $\exp X = Z$ which
we call a logarithm $X = \operatorname{Log}(Z)$ of $Z$. As for scalars, the matrix logarithm is multivalued
depending on the unwinding number [17, p. 270] since in general, a nonsingular real or complex
matrix may have an infinite number of real or complex logarithms. The goal, nevertheless,
is to find the unitary $Q \in \mathrm{U}(n)$ that minimizes $\|\operatorname{Log}(Q^*Z)\|^2$ and $\|\operatorname{sym}_*\operatorname{Log}(Q^*Z)\|^2$ over
all possible logarithms.

Since $\|\operatorname{Log}(Q^*Z)\|^2, \|\operatorname{sym}_*\operatorname{Log}(Q^*Z)\|^2 \ge 0$, it is clear that both infima exist. Moreover,
$\mathrm{U}(n)$ is compact and connected. One problematic aspect is that $\mathrm{U}(n)$ is a non-convex set
and the function $X \mapsto \|\operatorname{Log} X\|^2$ is non-convex. Since, in addition, the multivalued matrix
logarithm may fail to be continuous, at this point we cannot even claim the existence of
minimizers.

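There is also some subtlety due to the non-uniqueness of the matrix logarithm
depending on the unwinding number. What should $\min_{Q\in \mathrm{U}(n)} \|\operatorname{sym}_*\operatorname{Log} Q^*Z\|^2$ mean? For any
unitarily invariant norm we define, similar to the complex scalar case (2.10) and (2.14)

\[ \begin{aligned}
 \min_{Q\in \mathrm{U}(n)} \|\operatorname{Log}(Q^*Z)\|^2 &:= \min_{Q\in \mathrm{U}(n)} \{ \|X\|^2 \in \mathbb{R} \mid \exp X = Q^*Z \} , \\
 \min_{Q\in \mathrm{U}(n)} \|\operatorname{sym}_*\operatorname{Log}(Q^*Z)\|^2 &:= \min_{Q\in \mathrm{U}(n)} \{ \|\operatorname{sym}_* X\|^2 \in \mathbb{R} \mid \exp X = Q^*Z \} .
\end{aligned} \tag{3.1} \]

We first observe that without loss of generality we may assume that $Z \in GL(n,\mathbb{C})$ is real,
diagonal and positive definite, similar to (2.15). To see this, consider the unique polar
decomposition $Z = U_pH$ and the eigenvalue decomposition $H = VDV^*$ for real diagonal
positive $D = \operatorname{diag}(d_1, \ldots, d_n)$. Then, in analogy to (2.10), with $\tilde Q := V^*U_p^*QV$ and
$\tilde X := V^*XV$,

\[ \begin{aligned}
 \min_{Q\in \mathrm{U}(n)} \|\operatorname{sym}_*\operatorname{Log}(Q^*Z)\|^2
 &= \min_{Q\in \mathrm{U}(n)} \{ \|\operatorname{sym}_* X\|^2 \mid \exp X = Q^*Z \} \\
 &= \min_{Q\in \mathrm{U}(n)} \{ \|\operatorname{sym}_* X\|^2 \mid \exp X = Q^*U_pH \} \\
 &= \min_{Q\in \mathrm{U}(n)} \{ \|\operatorname{sym}_* X\|^2 \mid \exp X = Q^*U_pVDV^* \} \\
 &= \min_{Q\in \mathrm{U}(n)} \{ \|\operatorname{sym}_* X\|^2 \mid V^*(\exp X)V = V^*Q^*U_pVD \} \\
 &= \min_{Q\in \mathrm{U}(n)} \{ \|\operatorname{sym}_* X\|^2 \mid \exp(V^*XV) = V^*Q^*U_pVD \} \\
 &= \min_{\tilde Q\in \mathrm{U}(n)} \{ \|\operatorname{sym}_* X\|^2 \mid \exp(V^*XV) = \tilde Q^*D \} \\
 &= \min_{\tilde Q\in \mathrm{U}(n)} \{ \|V^*(\operatorname{sym}_* X)V\|^2 \mid \exp(V^*XV) = \tilde Q^*D \} \\
 &= \min_{\tilde Q\in \mathrm{U}(n)} \{ \|\operatorname{sym}_*(V^*XV)\|^2 \mid \exp(V^*XV) = \tilde Q^*D \} \\
 &= \min_{\tilde Q\in \mathrm{U}(n)} \{ \|\operatorname{sym}_*(\tilde X)\|^2 \mid \exp(\tilde X) = \tilde Q^*D \} \\
 &= \min_{Q\in \mathrm{U}(n)} \{ \|\operatorname{sym}_* X\|^2 \mid \exp X = Q^*D \} \\
 &= \min_{Q\in \mathrm{U}(n)} \|\operatorname{sym}_*\operatorname{Log} Q^*D\|^2 ,
\end{aligned} \tag{3.2} \]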

where we used the unitary invariance for any unitarily invariant matrix norm and the fact
that $X \mapsto \operatorname{sym}_* X$ and $X \mapsto \exp X$ are isotropic functions, i.e. $f(V^*XV) = V^*f(X)V$ for
all unitary $V$. If the minimum is achieved for $Q = I$ in $\min_{Q\in \mathrm{U}(n)} \|\operatorname{sym}_*\operatorname{Log}(Q^*D)\|^2$ then
this corresponds to $Q = U_p$ in $\min_{Q\in \mathrm{U}(n)} \|\operatorname{sym}_*\operatorname{Log} Q^*Z\|^2$. Therefore, in the following we
assume that $D = \operatorname{diag}(d_1, \ldots, d_n)$ with $d_1 \ge d_2 \ge \ldots \ge d_n > 0$.
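The reduction to a diagonal matrix can be checked numerically for a single unitary $Q$. In the sketch below the transformed unitary is $\tilde Q = V^*U_p^*QV$, so that $V^*(Q^*Z)V = \tilde Q^*D$; the logarithm is the principal one, computed from an eigendecomposition (helper `logm_eig`, assuming a diagonalizable argument), and $Q$ is sampled near $U_p$. All names and test data are illustrative.

```python
import numpy as np

def logm_eig(A):
    """Principal logarithm from the eigendecomposition (assumes A is diagonalizable)."""
    lam, V = np.linalg.eig(A)
    return V @ np.diag(np.log(lam.astype(complex))) @ np.linalg.inv(V)

def expm_skew_herm(S):
    """exp(S) for skew-Hermitian S, using that 1j*S is Hermitian."""
    lam, V = np.linalg.eigh(1j * S)
    return V @ np.diag(np.exp(-1j * lam)) @ V.conj().T

def sym(X):
    return 0.5 * (X + X.conj().T)

rng = np.random.default_rng(8)
n = 3
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
W, d, Vh = np.linalg.svd(Z)
V = Vh.conj().T
U_p, D = W @ Vh, np.diag(d)                           # Z = U_p V D V^*

B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q = U_p @ expm_skew_herm(0.1 * (B - B.conj().T))      # a unitary near U_p
Q_tilde = V.conj().T @ U_p.conj().T @ Q @ V           # transformed unitary

lhs = {p: np.linalg.norm(sym(logm_eig(Q.conj().T @ Z)), p) for p in ("fro", 2)}
rhs = {p: np.linalg.norm(sym(logm_eig(Q_tilde.conj().T @ D)), p) for p in ("fro", 2)}

print(lhs)
print(rhs)   # the same values in the Frobenius and the spectral norm
```

Taking $Q = U_p$ gives $\tilde Q = I$, which is the correspondence between the two minimizers used in the text.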
3.2 Some properties of the matrix exponential exp and matrix logarithm Log

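Let $Q \in \mathrm{U}(n)$. Then the following equalities hold for all $X \in \mathbb{C}^{n\times n}$.

\[ \begin{aligned}
 & \exp(Q^*XQ) = Q^*\exp(X)\,Q , && \text{definition of exp, gfl p. 715]} , && (3.3) \\
 & Q^*\operatorname{Log}(X)\,Q \ \text{is a logarithm of } Q^*XQ , && && (3.4) \\
 & \det(Q^*XQ) = \det(X) , && && (3.5) \\
 & \exp(-X) = \exp(X)^{-1} , && \text{series definition of exp, j3J p. 713]} , && \\
 & \exp\operatorname{Log} X = X , && \text{for any matrix logarithm} , && (3.6) \\
 & \det(\exp X) = e^{\operatorname{tr}(X)} , && \text{gl p. 712]} , && (3.7) \\
 & \forall\, Y \in \mathbb{C}^{n\times n},\ \det(Y) \neq 0: \ \det(Y) = e^{\operatorname{tr}(\operatorname{Log} Y)} && \text{for any matrix logarithm [17]} . &&
\end{aligned} \]

A major difficulty in the multivalued matrix logarithm case arises from

\[ \forall\, X \in \mathbb{C}^{n\times n}: \quad \operatorname{Log}\exp X \neq X \ \text{in general, without further assumptions.} \tag{3.8} \]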
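A concrete instance of (3.8), together with (3.6) and (3.7), is obtained from a matrix whose eigenvalues have imaginary part larger than $\pi$. In the sketch below the exponential and the principal logarithm are evaluated through eigendecompositions (helper `funm_eig`, valid here because the matrices involved are diagonalizable); the specific matrix is an illustrative choice.

```python
import numpy as np

def funm_eig(A, f):
    """Evaluate a matrix function through the eigendecomposition (A assumed diagonalizable)."""
    lam, V = np.linalg.eig(A)
    return V @ np.diag(f(lam.astype(complex))) @ np.linalg.inv(V)

a = 2 * np.pi + 1.0
X = np.array([[0.3, -a], [a, 0.3]])          # eigenvalues 0.3 +/- i(2*pi + 1)
E = funm_eig(X, np.exp).real                 # exp(X)
L = funm_eig(E, np.log).real                 # principal logarithm of exp(X)

print(L)                                              # close to [[0.3, -1], [1, 0.3]], not X
print(np.allclose(funm_eig(L, np.exp).real, E))       # exp Log = identity map, cf. (3.6)
print(np.linalg.det(E), np.exp(np.trace(X)))          # det(exp X) = e^{tr X}, cf. (3.7)
```

The principal logarithm folds the imaginary part $2\pi + 1$ of the eigenvalues back to $1$, so it returns a different logarithm of the same matrix $\exp X$.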



3.3 Properties of the principal matrix-logarithm log 

The principal matrix logarithm. Let $X \in \mathbb{C}^{n\times n}$, and assume that $X$ has no real
eigenvalues in $(-\infty, 0]$. The principal matrix logarithm of $X$ is the unique logarithm of $X$ (the
unique solution $Y \in \mathbb{C}^{n\times n}$ of $\exp Y = X$) whose eigenvalues are elements of the strip
$\{z \in \mathbb{C} : -\pi < \Im(z) < \pi\}$. If $X \in \mathbb{R}^{n\times n}$ and $X$ has no eigenvalues on the closed
negative real axis $\mathbb{R}^- = (-\infty, 0]$, then the principal matrix logarithm is real. Recall that
$\log X$ is the principal logarithm and $\operatorname{Log} X$ denotes one of the many solutions to $\exp Y = X$.
The following statements apply strictly only to the principal matrix logarithm [U p. 721]:

\[ \begin{aligned}
 & \log\exp X = X \ \text{ if and only if } \ |\Im\lambda| < \pi \ \text{ for all } \lambda \in \operatorname{spec}(X) , \\
 & \log(X^\alpha) = \alpha\log X , \qquad \alpha \in [-1, 1] , \\
 & \log(Q^*XQ) = Q^*\log(X)\,Q , \qquad \forall\, Q \in \mathrm{U}(n) .
\end{aligned} \tag{3.9} \]

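Let us define the set of Hermitian matrices $\mathbb{H}(n) := \{X \in \mathbb{C}^{n \times n} \mid X^* = X\}$ and the set
$\mathbb{P}(n)$ of positive definite Hermitian matrices consisting of all Hermitian matrices with only
positive eigenvalues. The mapping

  $\exp : \mathbb{H}(n) \to \mathbb{P}(n)$  (3.10)

is bijective [3, p. 719]. In particular, $\mathrm{Log} \exp \mathrm{sym}_* X$ is uniquely defined for any $X \in \mathbb{C}^{n \times n}$
and any matrix logarithm, up to additions of multiples of $2\pi i$ to each eigenvalue, and therefore
we have

  $\forall\, H \in \mathbb{H}(n) : \ \mathrm{sym}_* \mathrm{Log}\, H = \log H$ ,
  $\forall\, X \in \mathbb{C}^{n \times n} : \ \log \exp \mathrm{sym}_* X = \mathrm{sym}_* X$ ,  (3.11)
  $\forall\, X \in \mathbb{C}^{n \times n} : \ \mathrm{sym}_* \mathrm{Log} \exp \mathrm{sym}_* X = \mathrm{sym}_* X$ .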
Since $\exp \mathrm{sym}_* X$ is positive definite, it follows from (3.9)$_3$ also that

  $\forall\, X \in \mathbb{C}^{n \times n} : \ Q^* (\log \exp \mathrm{sym}_* X)\, Q = \log(Q^* (\exp \mathrm{sym}_* X)\, Q)$ .  (3.12)

4 Minimizing $\min_{Q \in U(n)} \|\mathrm{Log}(Q^*Z)\|^2$

Our starting point is, in analogy with the complex case, the problem of minimizing

  $\min_{Q \in U(n)} \|\mathrm{sym}_*(\mathrm{Log}(Q^*Z))\|^2$ ,

where $\mathrm{sym}_*(X) = (X^* + X)/2$ is the Hermitian part of $X$. As we will see, a solution of this
problem will already imply the full statement, similar to the complex case, see (2.11). For
every complex number $z$, we have

  $|e^z| = e^{\mathrm{Re}\, z} = |e^{\mathrm{Re}\, z}| \le |e^{\mathrm{Re}\, z}|$ .  (4.1)

While the last inequality in (4.1) is superfluous, it is in fact the "inequality" $|e^z| \le |e^{\mathrm{Re}\, z}|$
that can be generalized to the matrix case. The key result is an inequality of Bhatia [4,
Thm. IX.3.1],

  $\forall\, X \in \mathbb{C}^{n \times n} : \ \|\exp X\|^2 \le \|\exp \mathrm{sym}_* X\|^2$  (4.2)

for any unitarily invariant norm, cf. [17, Thm. 10.11]. The result (4.2) is a generalization
of Bernstein's trace inequality for the matrix exponential: in terms of the Frobenius matrix
norm it holds

  $\|\exp X\|_F^2 = \mathrm{tr}(\exp X \exp X^*) \le \mathrm{tr}(\exp(X + X^*)) = \|\exp \mathrm{sym}_* X\|_F^2$ ,

with equality if and only if $X$ is normal [3, p. 756], [21, p. 515]. For the case of the spectral
norm the inequality (4.2) is already given by Dahlquist [12, (1.3.8)]. We note that the
Golden-Thompson inequalities [3, p. 761], [21, Cor. 6.5.22(3)]:

  $\forall\, X, Y \in \mathbb{H}(n) : \ \mathrm{tr}(\exp(X + Y)) \le \mathrm{tr}(\exp(X) \exp(Y))$

seem (misleadingly) to suggest the reverse inequality.
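Bernstein's trace inequality can be checked numerically on small real matrices. The sketch below is an illustration only: it uses a hand-written scaling-and-squaring exponential (the helper `mat_exp` and its parameters are assumptions of this sketch, not a method from the text) and compares both sides for one non-normal and one normal matrix.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(a, squarings=10, terms=20):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    n = len(a)
    scaled = [[v / 2 ** squarings for v in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def transpose(a):
    return [list(col) for col in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def bernstein_sides(x):
    """Return (tr(exp X exp X^T), tr(exp(X + X^T))) for a real square matrix X."""
    n = len(x)
    ex = mat_exp(x)
    xt = transpose(x)
    lhs = trace(mat_mul(ex, transpose(ex)))
    rhs = trace(mat_exp([[x[i][j] + xt[i][j] for j in range(n)] for i in range(n)]))
    return lhs, rhs

non_normal = [[0.0, 1.0], [0.0, 0.0]]   # strict inequality expected
normal = [[0.0, 1.0], [-1.0, 0.0]]      # skew-symmetric, hence normal: equality expected
lhs_nn, rhs_nn = bernstein_sides(non_normal)
lhs_n, rhs_n = bernstein_sides(normal)
print(lhs_nn, rhs_nn, 2.0 * math.cosh(1.0))
print(lhs_n, rhs_n)
```

For the nilpotent example the left side is $3$ and the right side is $2\cosh(1)$, so the inequality is strict, while the normal example gives equality, consistent with the equality condition quoted above.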

Consider for the moment any unitarily invariant norm, any $Q \in U(n)$, the positive real
diagonal matrix $D$ as before and any matrix logarithm Log. Then it holds

$$
\|\exp(\mathrm{sym}_* \mathrm{Log}\, Q^*D)\|^2 \ge \|\exp(\mathrm{Log}\, Q^*D)\|^2 = \|Q^*D\|^2 = \|D\|^2 , \tag{4.3}
$$

due to inequality (4.2) and

$$
\begin{aligned}
\|\exp(-\mathrm{sym}_* \mathrm{Log}\, Q^*D)\|^2 &= \|\exp(\mathrm{sym}_*(-\mathrm{Log}\, Q^*D))\|^2 \\
&\ge \|\exp(-\mathrm{Log}\, Q^*D)\|^2 = \|(\exp(\mathrm{Log}\, Q^*D))^{-1}\|^2 \\
&= \|(Q^*D)^{-1}\|^2 = \|D^{-1}(Q^*)^{-1}\|^2 = \|D^{-1}\|^2 ,
\end{aligned}
\tag{4.4}
$$

where we used (4.2) again. Note that we did not use $-\mathrm{Log}\, X = \mathrm{Log}(X^{-1})$ (which may be
wrong, depending on the unwinding number).

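Moreover, we note that for any $Q \in U(n)$ we have

$$
\begin{aligned}
0 < \det(\exp(\mathrm{sym}_* \mathrm{Log}\, Q^*D)) &= e^{\mathrm{tr}(\mathrm{sym}_* \mathrm{Log}\, Q^*D)} = e^{\mathrm{Re}\, \mathrm{tr}(\mathrm{Log}\, Q^*D)} \\
&= |e^{\mathrm{Re}\, \mathrm{tr}(\mathrm{Log}\, Q^*D)}| = |e^{\mathrm{tr}(\mathrm{Log}\, Q^*D)}| \\
&= |\det(Q^*D)| = |\det(Q^*) \det(D)| \\
&= |\det(Q^*)|\, |\det(D)| = |\det(D)| = \det(D) ,
\end{aligned}
\tag{4.5}
$$

where we used the fact that

$$
e^{\mathrm{tr}(X)} = \det(\exp X) , \quad X = \mathrm{Log}\, Q^*D : \qquad
e^{\mathrm{tr}(\mathrm{Log}\, Q^*D)} = \det(\exp \mathrm{Log}\, Q^*D) = \det(Q^*D) \tag{4.6}
$$

is valid for any solution $X \in \mathbb{C}^{n \times n}$ of $\exp X = Q^*D$ and that $\mathrm{tr}(\mathrm{sym}_* \mathrm{Log}\, Q^*D)$ is real.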

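For any $Q \in U(n)$ the Hermitian positive definite matrices $\exp(\mathrm{sym}_* \mathrm{Log}\, Q^*D)$ and
$\exp(-\mathrm{sym}_* \mathrm{Log}\, Q^*D)$ can be simultaneously unitarily diagonalized with positive eigenvalues,
i.e., for some $Q_1 \in U(n)$

$$
\begin{aligned}
Q_1^* \exp(\mathrm{sym}_* \mathrm{Log}\, Q^*D)\, Q_1 &= \exp(Q_1^* (\mathrm{sym}_* \mathrm{Log}\, Q^*D)\, Q_1) = \mathrm{diag}(x_1, \ldots, x_n) , \\
Q_1^* \exp(-\mathrm{sym}_* \mathrm{Log}\, Q^*D)\, Q_1 &= \exp(-Q_1^* (\mathrm{sym}_* \mathrm{Log}\, Q^*D)\, Q_1) \\
&= \left(\exp(Q_1^* (\mathrm{sym}_* \mathrm{Log}\, Q^*D)\, Q_1)\right)^{-1} = \mathrm{diag}\!\left(\tfrac{1}{x_1}, \ldots, \tfrac{1}{x_n}\right) ,
\end{aligned}
\tag{4.7}
$$

since $X \mapsto \exp X$ is an isotropic function. We arrange the positive real eigenvalues in
decreasing order $x_1 \ge x_2 \ge \ldots \ge x_n > 0$. For any unitarily invariant norm it follows
therefore from (4.3), (4.4) and (4.5) together with (4.7) that

$$
\begin{aligned}
\|\mathrm{diag}(x_1, \ldots, x_n)\|^2 &= \|Q_1^* \exp(\mathrm{sym}_* \mathrm{Log}\, Q^*D)\, Q_1\|^2 = \|\exp(\mathrm{sym}_* \mathrm{Log}\, Q^*D)\|^2 \ge \|D\|^2 , \\
\|\mathrm{diag}(\tfrac{1}{x_1}, \ldots, \tfrac{1}{x_n})\|^2 &= \|Q_1^* \exp(-\mathrm{sym}_* \mathrm{Log}\, Q^*D)\, Q_1\|^2 = \|\exp(-\mathrm{sym}_* \mathrm{Log}\, Q^*D)\|^2 \ge \|D^{-1}\|^2 , \\
\det \mathrm{diag}(x_1, \ldots, x_n) &= \det(Q_1^* \exp(\mathrm{sym}_* \mathrm{Log}\, Q^*D)\, Q_1) = \det(\exp(\mathrm{sym}_* \mathrm{Log}\, Q^*D)) = \det(D) .
\end{aligned}
\tag{4.8}
$$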
4.1 Frobenius matrix norm for n = 2, 3

Now consider the Frobenius matrix norm for dimension $n = 3$. The three conditions in (4.8)
can be expressed as

$$
\begin{aligned}
x_1^2 + x_2^2 + x_3^2 &\ge d_1^2 + d_2^2 + d_3^2 , \\
\frac{1}{x_1^2} + \frac{1}{x_2^2} + \frac{1}{x_3^2} &\ge \frac{1}{d_1^2} + \frac{1}{d_2^2} + \frac{1}{d_3^2} , \\
x_1\, x_2\, x_3 &= d_1\, d_2\, d_3 .
\end{aligned}
\tag{4.9}
$$
By a new result, the "sum of squared logarithms inequality" [8], conditions (4.9) imply

$$
(\log x_1)^2 + (\log x_2)^2 + (\log x_3)^2 \ge (\log d_1)^2 + (\log d_2)^2 + (\log d_3)^2 , \tag{4.10}
$$

with equality if and only if $(x_1, x_2, x_3) = (d_1, d_2, d_3)$. This is true despite the map $t \mapsto (\log t)^2$
being non-convex. Similarly, for the two-dimensional case with a much simpler proof [5]

$$
\left.
\begin{aligned}
x_1^2 + x_2^2 &\ge d_1^2 + d_2^2 \\
\frac{1}{x_1^2} + \frac{1}{x_2^2} &\ge \frac{1}{d_1^2} + \frac{1}{d_2^2} \\
x_1\, x_2 &= d_1\, d_2
\end{aligned}
\right\}
\;\Rightarrow\;
(\log x_1)^2 + (\log x_2)^2 \ge (\log d_1)^2 + (\log d_2)^2 . \tag{4.11}
$$
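The implication from (4.9) to (4.10) can be probed by random sampling. The following is a rough numerical sanity check, not a proof: the sampling ranges, the seed and the tolerances are arbitrary choices made for this sketch.

```python
import math
import random

def satisfies_hypotheses(x, d):
    """Check the three conditions (4.9) for positive triples x and d."""
    sum_sq = sum(v * v for v in x) >= sum(v * v for v in d)
    sum_inv_sq = sum(1.0 / (v * v) for v in x) >= sum(1.0 / (v * v) for v in d)
    same_product = abs(math.prod(x) - math.prod(d)) <= 1e-9 * math.prod(d)
    return sum_sq and sum_inv_sq and same_product

def sum_squared_logs(values):
    return sum(math.log(v) ** 2 for v in values)

def count_checks(trials=20000, seed=0):
    """Return (number of sampled pairs satisfying (4.9), number violating (4.10))."""
    rng = random.Random(seed)
    checked = violations = 0
    for _ in range(trials):
        d = [math.exp(rng.uniform(-2.0, 2.0)) for _ in range(3)]
        x = [math.exp(rng.uniform(-2.0, 2.0)) for _ in range(3)]
        # Rescale x so that x1*x2*x3 = d1*d2*d3 holds up to rounding.
        scale = (math.prod(d) / math.prod(x)) ** (1.0 / 3.0)
        x = [scale * v for v in x]
        if not satisfies_hypotheses(x, d):
            continue
        checked += 1
        if sum_squared_logs(x) < sum_squared_logs(d) - 1e-9:
            violations += 1
    return checked, violations

checked, violations = count_checks()
print(checked, violations)
```

Only pairs that satisfy all three hypotheses are counted, so `violations` staying at zero is consistent with (4.10) but says nothing about dimensions beyond three.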

Since on the one hand (3.11) and (3.12) imply

$$
\begin{aligned}
(\log x_1)^2 + (\log x_2)^2 + (\log x_3)^2 &= \|\log \mathrm{diag}(x_1, x_2, x_3)\|_F^2 \\
&= \|\log(Q_1^* \exp(\mathrm{sym}_* \mathrm{Log}\, Q^*D)\, Q_1)\|_F^2 \\
&= \|Q_1^* \log \exp(\mathrm{sym}_* \mathrm{Log}\, Q^*D)\, Q_1\|_F^2 \\
&= \|\log \exp(\mathrm{sym}_* \mathrm{Log}\, Q^*D)\|_F^2 = \|\mathrm{sym}_* \mathrm{Log}\, Q^*D\|_F^2
\end{aligned}
\tag{4.12}
$$

and clearly

$$
(\log d_1)^2 + (\log d_2)^2 + (\log d_3)^2 = \|\log D\|_F^2 , \tag{4.13}
$$

we may combine (4.12) and (4.13) with the sum of squared logarithms inequality (4.10) to
obtain

$$
\|\mathrm{sym}_* \mathrm{Log}\, Q^*D\|_F^2 \ge \|\log D\|_F^2 \tag{4.14}
$$

for any $Q \in U(3)$. Since on the other hand we have the trivial upper bound (choose $Q = I$)

$$
\min_{Q \in U(3)} \|\mathrm{sym}_* \mathrm{Log}(Q^*D)\|_F^2 \le \|\log D\|_F^2 , \tag{4.15}
$$

this shows that

$$
\min_{Q \in U(3)} \|\mathrm{sym}_* \mathrm{Log}(Q^*D)\|_F^2 = \|\log D\|_F^2 . \tag{4.16}
$$

The minimum is realized for $Q = I$, which corresponds to the polar factor $U_p$ in the original
formulation. Noting that

$$
\|\mathrm{Log}(Q^*D)\|_F^2 = \|\mathrm{sym}_* \mathrm{Log}(Q^*D)\|_F^2 + \|\mathrm{skew}_* \mathrm{Log}(Q^*D)\|_F^2 \ge \|\mathrm{sym}_* \mathrm{Log}(Q^*D)\|_F^2 \tag{4.17}
$$

by the orthogonality of the Hermitian and skew-Hermitian parts in the trace scalar product,
we also obtain

$$
\min_{Q \in U(3)} \|\mathrm{Log}(Q^*D)\|_F^2 \ge \min_{Q \in U(3)} \|\mathrm{sym}_* \mathrm{Log}(Q^*D)\|_F^2 = \|\log D\|_F^2 . \tag{4.18}
$$

Hence, combining again we obtain for all $\mu > 0$ and all $\mu_c \ge 0$

$$
\min_{Q \in U(3)} \mu\, \|\mathrm{sym}_* \mathrm{Log}(Q^*D)\|_F^2 + \mu_c\, \|\mathrm{skew}_* \mathrm{Log}(Q^*D)\|_F^2 = \mu\, \|\log D\|_F^2 . \tag{4.19}
$$

Observe that although we allowed Log to be any matrix logarithm, the one that gives
the smallest $\|\mathrm{Log}(Q^*D)\|_F$, $\|\mathrm{sym}_* \mathrm{Log}(Q^*D)\|_F$ is the principal logarithm.
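The orthogonal splitting used in (4.17) holds for every square complex matrix, not only for logarithms. A minimal sketch with an arbitrary sample matrix (the matrix `x` is made up for illustration):

```python
def conj_transpose(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def frob_sq(a):
    """Squared Frobenius norm."""
    return sum(abs(v) ** 2 for row in a for v in row)

def sym_part(a):
    """Hermitian part (A + A*)/2."""
    at = conj_transpose(a)
    return [[(a[i][j] + at[i][j]) / 2 for j in range(len(a))] for i in range(len(a))]

def skew_part(a):
    """Skew-Hermitian part (A - A*)/2."""
    at = conj_transpose(a)
    return [[(a[i][j] - at[i][j]) / 2 for j in range(len(a))] for i in range(len(a))]

x = [[1 + 2j, 0.5 - 1j, 3j],
     [2.0 + 0j, -1 + 1j, 0.25 + 0j],
     [-1j, 4 + 0j, 0.5 - 0.5j]]

total = frob_sq(x)
split = frob_sq(sym_part(x)) + frob_sq(skew_part(x))
print(total, split)
```

Because the two parts are orthogonal in the trace scalar product, dropping the skew-Hermitian part can only decrease the Frobenius norm, which is the inequality in (4.17).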
4.2 Spectral matrix norm for arbitrary $n \in \mathbb{N}$

For the spectral norm, the conditions (4.8) can be expressed as

$$
\begin{aligned}
x_1^2 \ge d_1^2 , \qquad \frac{1}{x_n^2} &\ge \frac{1}{d_n^2} , \\
x_1\, x_2\, x_3 \cdots x_n &= d_1\, d_2\, d_3 \cdots d_n .
\end{aligned}
\tag{4.20}
$$

This yields the following ordering

$$
0 < x_n \le d_n \le \ldots \le d_1 \le x_1 . \tag{4.21}
$$

It is easy to see that this implies (even without the determinant condition (4.20)$_3$)

$$
\max\{|\log x_n| , |\log d_n| , |\log d_1| , |\log x_1|\} = \max\{|\log x_n| , |\log x_1|\} , \tag{4.22}
$$

which shows

$$
\max\{|\log d_n| , |\log d_1|\} \le \max\{|\log x_n| , |\log x_1|\} . \tag{4.23}
$$

Therefore, cf. (4.12),

$$
\begin{aligned}
\|\mathrm{sym}_* \mathrm{Log}\, Q^*D\|_2^2 &= \|\log \mathrm{diag}(x_1, \ldots, x_n)\|_2^2 \\
&= \|\mathrm{diag}(\log x_1, \ldots, \log x_n)\|_2^2 \\
&= \max_{i=1,\ldots,n} \{|\log x_1| , |\log x_2| , \ldots , |\log x_n|\}^2 \\
&= \max_{i=1,\ldots,n} \{(\log x_1)^2 , (\log x_2)^2 , \ldots , (\log x_n)^2\} \\
&= \max\{(\log x_1)^2 , (\log x_n)^2\} \\
&\ge \max\{(\log d_1)^2 , (\log d_n)^2\} \\
&= \max_{i=1,\ldots,n} \{(\log d_1)^2 , (\log d_2)^2 , \ldots , (\log d_n)^2\} \\
&= \max_{i=1,\ldots,n} \{|\log d_1| , |\log d_2| , \ldots , |\log d_n|\}^2 \\
&= \|\mathrm{diag}(\log d_1, \ldots, \log d_n)\|_2^2 \\
&= \|\log \mathrm{diag}(d_1, \ldots, d_n)\|_2^2 = \|\log D\|_2^2 ,
\end{aligned}
\tag{4.24}
$$

from which we obtain, as in the case of the Frobenius norm, due to unitary invariance,

$$
\min_{Q \in U(n)} \|\mathrm{sym}_* \mathrm{Log}(Q^*D)\|_2^2 = \|\log D\|_2^2 . \tag{4.25}
$$
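The step from the ordering (4.21) to the bound (4.23) uses only the two outer inequalities $x_1 \ge d_1$ and $x_n \le d_n$. The sketch below samples such configurations at random; the dimension, the seed and the sampling ranges are arbitrary assumptions of this illustration.

```python
import math
import random

def max_abs_log(values):
    """Spectral norm of diag(log v_1, ..., log v_n)."""
    return max(abs(math.log(v)) for v in values)

def ordering_implies_bound(trials=10000, seed=1, n=5):
    rng = random.Random(seed)
    for _ in range(trials):
        d = sorted((math.exp(rng.uniform(-3.0, 3.0)) for _ in range(n)), reverse=True)
        # Draw x with x_1 >= d_1 and x_n <= d_n; the middle entries only need to lie in between.
        x_first = d[0] * math.exp(rng.uniform(0.0, 1.0))
        x_last = d[-1] * math.exp(-rng.uniform(0.0, 1.0))
        middle = [math.exp(rng.uniform(math.log(x_last), math.log(x_first)))
                  for _ in range(n - 2)]
        x = [x_first] + sorted(middle, reverse=True) + [x_last]
        if max_abs_log(d) > max_abs_log(x) + 1e-12:
            return False
    return True

holds = ordering_implies_bound()
print(holds)
```

No determinant condition is imposed on the samples, which matches the remark that (4.22) holds even without it.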

For complex numbers we have the bound $|z| \ge |\mathrm{Re}\, z|$. A matrix analogue is that the spectral
norm of some matrix $X \in \mathbb{C}^{n \times n}$ bounds the spectral norm of the Hermitian part $\mathrm{sym}_* X$,
see [3, p. 355] and [21, p. 151], i.e. $\|X\|_2^2 \ge \|\mathrm{sym}_* X\|_2^2$. In fact, this inequality holds for all
unitarily invariant norms [20, p. 454]:

  $\forall\, X \in \mathbb{C}^{n \times n} : \ \|X\|^2 \ge \|\mathrm{sym}_* X\|^2$ .  (4.26)

Therefore we conclude that for the spectral norm, in any dimension we have

$$
\min_{Q \in U(n)} \|\mathrm{Log}(Q^*D)\|_2^2 \ge \min_{Q \in U(n)} \|\mathrm{sym}_* \mathrm{Log}(Q^*D)\|_2^2 = \|\log D\|_2^2 , \tag{4.27}
$$

with equality holding for $Q = U_p$.
5 The real Frobenius case on SO(3)

In this section we consider $Z \in \mathrm{GL}^+(3, \mathbb{R})$, which implies that $Z = U_p H$ admits the polar
decomposition with $U_p \in \mathrm{SO}(3)$ and an eigenvalue decomposition $H = V D V^T$ for $V \in \mathrm{SO}(3)$.
We observe that

$$
\min_{Q \in \mathrm{SO}(n)} \|\mathrm{sym}_* \mathrm{Log}(Q^T D)\|_F^2 \ge \min_{Q \in U(n)} \|\mathrm{sym}_* \mathrm{Log}(Q^* D)\|_F^2 . \tag{5.1}
$$

Therefore, for all $\mu > 0$, $\mu_c \ge 0$ we have, using inequality (5.1),

$$
\begin{aligned}
\min_{Q \in \mathrm{SO}(3)} \; &\mu\, \|\mathrm{sym}_* \mathrm{Log}(Q^T Z)\|_F^2 + \mu_c\, \|\mathrm{skew}_* \mathrm{Log}(Q^T Z)\|_F^2 \\
&\ge \min_{Q \in \mathrm{SO}(3)} \mu\, \|\mathrm{sym}_* \mathrm{Log}(Q^T Z)\|_F^2 \\
&= \mu\, \|\mathrm{sym}_* \log(U_p^T Z)\|_F^2 = \mu\, \|\mathrm{sym}_* \log(U_p^T Z)\|_F^2 + \mu_c\, \|\mathrm{skew}_* \log(U_p^T Z)\|_F^2 ,
\end{aligned}
\tag{5.2}
$$

and it follows that the solution to the minimization problem (1.8) for $Z \in \mathrm{GL}^+(n, \mathbb{R})$ and
$n = 2, 3$ is also obtained by the orthogonal polar factor (a similar argument holds for $n = 2$).

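Denoting by $\mathrm{dev}_n X = X - \frac{1}{n} \mathrm{tr}(X)\, I$ the orthogonal projection of $X \in \mathbb{R}^{n \times n}$ onto trace
free matrices in the trace scalar product, we obtain a further result of interest in its own
right (in which we really need $Q \in \mathrm{SO}(3)$), namely

$$
\begin{aligned}
\min_{Q \in \mathrm{SO}(3)} \|\mathrm{dev}_3 \mathrm{Log}(Q^T D)\|_F^2 &= \|\mathrm{dev}_3 \log D\|_F^2 , \\
\min_{Q \in \mathrm{SO}(3)} \|\mathrm{dev}_3\, \mathrm{sym}_* \mathrm{Log}(Q^T D)\|_F^2 &= \|\mathrm{dev}_3 \log D\|_F^2 .
\end{aligned}
\tag{5.3}
$$

As in the previous section, it suffices to show (5.3)$_2$. This is true since by using (4.6)
for $Q \in \mathrm{SO}(3)$ we have

$$
\begin{aligned}
\min_{Q \in \mathrm{SO}(3)} \|\mathrm{dev}_3\, \mathrm{sym}_* \mathrm{Log}(Q^T D)\|_F^2
&= \min_{Q \in \mathrm{SO}(3)} \left( \|\mathrm{sym}_* \mathrm{Log}(Q^T D)\|_F^2 - \tfrac{1}{3}\, \mathrm{tr}(\mathrm{Log}\, Q^T D)^2 \right) \\
&= \min_{Q \in \mathrm{SO}(3)} \left( \|\mathrm{sym}_* \mathrm{Log}(Q^T D)\|_F^2 - \tfrac{1}{3}\, (\log \det(Q^T D))^2 \right) \\
&= \min_{Q \in \mathrm{SO}(3)} \|\mathrm{sym}_* \mathrm{Log}(Q^T D)\|_F^2 - \tfrac{1}{3}\, (\log \det(D))^2 \\
&= \min_{Q \in \mathrm{SO}(3)} \|\mathrm{sym}_* \mathrm{Log}(Q^T D)\|_F^2 - \tfrac{1}{3}\, \mathrm{tr}(\log D)^2 \\
&\ge \min_{Q \in U(n)} \|\mathrm{sym}_* \mathrm{Log}(Q^* D)\|_F^2 - \tfrac{1}{3}\, \mathrm{tr}(\log D)^2 \\
&= \|\mathrm{sym}_* \log D\|_F^2 - \tfrac{1}{3}\, \mathrm{tr}(\log D)^2 \\
&= \|\mathrm{sym}_* \log D\|_F^2 - \tfrac{1}{3}\, \mathrm{tr}(\mathrm{sym}_* \log D)^2 \\
&= \|\mathrm{dev}_3\, \mathrm{sym}_* \log D\|_F^2 = \|\mathrm{dev}_3 \log D\|_F^2 .
\end{aligned}
\tag{5.4}
$$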
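The chain (5.4) repeatedly uses the identity $\|\mathrm{dev}_3 X\|_F^2 = \|X\|_F^2 - \frac{1}{3} \mathrm{tr}(X)^2$. A minimal sketch verifying it on one arbitrary symmetric sample matrix (the entries of `s` are made up for illustration):

```python
def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def frob_sq(a):
    """Squared Frobenius norm of a real matrix."""
    return sum(v * v for row in a for v in row)

def dev(a):
    """Deviatoric (trace-free) part a - tr(a)/n * I."""
    n = len(a)
    shift = trace(a) / n
    return [[a[i][j] - (shift if i == j else 0.0) for j in range(n)] for i in range(n)]

s = [[1.0, 0.5, -2.0],
     [0.5, 3.0, 0.25],
     [-2.0, 0.25, -1.5]]

lhs = frob_sq(dev(s))
rhs = frob_sq(s) - trace(s) ** 2 / 3.0
print(lhs, rhs)
```

The identity is the Pythagorean theorem for the orthogonal decomposition of a matrix into its trace-free part and a multiple of the identity.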
6 Uniqueness

We have seen that the polar factor $U_p$ minimizes both $\|\mathrm{Log}(Q^*Z)\|^2$ and $\|\mathrm{sym}_*(\mathrm{Log}(Q^*Z))\|^2$,
but what about its uniqueness? Is there any other unitary matrix that also attains the
minimum? We address these questions below.

6.1 Uniqueness of $U_p$ as the minimizer of $\|\mathrm{Log}(Q^*Z)\|^2$

Note that the unitary polar factor $U_p$ itself is not unique when $Z$ does not have full column
rank [17, Thm. 8.1]. However, in our setting we do not consider this case because $\mathrm{Log}(UZ)$
is defined only if $UZ$ is nonsingular.

We show below that $U_p$ is the unique minimizer of $\|\mathrm{Log}(Q^*Z)\|^2$ for the Frobenius
norm, while for the spectral norm there can be many $Q \in U(n)$ for which
$\|\log(Q^*Z)\|^2 = \|\log(U_p^*Z)\|^2$.

Frobenius norm for $n \le 3$. We focus on $n = 3$ as the case $n = 2$ is analogous and simpler.
By the fact that $Q = U_p$ satisfies equality in (4.18), any minimizer $Q$ of $\|\mathrm{Log}(Q^*D)\|_F$ must
satisfy

$$
\|\mathrm{Log}(Q^*D)\|_F = \|\mathrm{sym}_* \mathrm{Log}(Q^*D)\|_F = \|\log D\|_F . \tag{6.1}
$$

Note that by (4.17) the first equality of (6.1) holds only if $\mathrm{Log}(Q^*D)$ is Hermitian.

We now examine the condition under which the latter equality of (6.1) holds. Since $\mathrm{Log}(Q^*D)$
is Hermitian, the matrix $\exp(\mathrm{Log}(Q^*D))$ is positive definite, so we can write
$\exp(\mathrm{Log}(Q^*D)) = Q_1^*\, \mathrm{diag}(x_1, x_2, x_3)\, Q_1$ for some unitary $Q_1$ and $x_1, x_2, x_3 > 0$. Therefore

$$
\log(Q^*D) = Q_1^*\, \mathrm{diag}(\log x_1, \log x_2, \log x_3)\, Q_1 . \tag{6.2}
$$

Hence for $\|\mathrm{sym}_* \mathrm{Log}(Q^*D)\|_F = \|\log D\|_F$ to hold we need

$$
(\log x_1)^2 + (\log x_2)^2 + (\log x_3)^2 = (\log d_1)^2 + (\log d_2)^2 + (\log d_3)^2 ,
$$

which is precisely the case where equality holds in the sum of squared logarithms inequality
(4.10). As discussed above, equality holds in (4.10) if and only if $(x_1, x_2, x_3) = (d_1, d_2, d_3)$.
Hence by (6.2) we have $\log(Q^*D) = Q_1^*\, \mathrm{diag}(\log x_1, \log x_2, \log x_3)\, Q_1 = Q_1^* \log(D)\, Q_1$, so taking
the exponential of both sides yields

$$
Q^*D = Q_1^* D\, Q_1 . \tag{6.3}
$$

Hence $Q_1 Q^* D\, Q_1^* = D$. Since $Q_1 Q^*$ and $Q_1$ are both unitary matrices, this is a singular
value decomposition of $D$. Suppose $d_1 > d_2 > d_3$. Then since the singular vectors of
distinct singular values are unique up to multiplication by $e^{i\vartheta}$, it follows that
$Q_1 Q^* = Q_1 = \mathrm{diag}(e^{i\vartheta_1}, e^{i\vartheta_2}, e^{i\vartheta_3})$ for $\vartheta_i \in \mathbb{R}$, so $Q = I$. If some of the $d_i$ are equal, for example if
$d_1 = d_2 > d_3$, then we have $Q_1 = \mathrm{diag}(Q_{1,1}, e^{i\vartheta_3})$ where $Q_{1,1}$ is an arbitrary $2 \times 2$ unitary
matrix, but we still have $Q = I$. If $d_1 = d_2 = d_3$, then $Q_1$ can be any unitary matrix but
again $Q = I$. Overall, for (6.3) to hold we always need $Q = I$, which corresponds to the
unitary polar factor $U_p$ in the original formulation. Thus $Q = U_p$ is the unique minimizer of
$\|\mathrm{Log}(Q^*D)\|_F$ with minimum $\|\log(U_p^*D)\|_F$.
Spectral norm. For the spectral norm there can be many unitary matrices $Q$ that attain
$\|\mathrm{Log}(Q^*Z)\|_2^2 = \|\log(U_p^*Z)\|_2^2$. For example, consider $Z = \begin{pmatrix} e & 0 \\ 0 & 1 \end{pmatrix}$. The unitary polar factor
is $U_p = I$. Defining $U_1 = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\vartheta} \end{pmatrix}$ we have $\|\log(U_1 Z)\|_2 = \left\| \begin{pmatrix} 1 & 0 \\ 0 & i\vartheta \end{pmatrix} \right\|_2 = 1$ for any $\vartheta \in [-1, 1]$.

Now we discuss the general form of the minimizer $Q$. Let $Z = U \Sigma V^*$ be the SVD with
$\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_n)$. Recall that $\|\log(U_p^*Z)\|_2 = \max(|\log \sigma_1(Z)|, |\log \sigma_n(Z)|)$.

Suppose that $\|\log(U_p^*Z)\|_2 = |\log \sigma_1(Z)| \ge |\log \sigma_n(Z)|$. Then for any $Q = U\, \mathrm{diag}(1, Q_{22})\, V^*$
we have

  $\log Q^*Z = \log V\, \mathrm{diag}(1, Q_{22})\, \Sigma\, V^*$ ,

so we have $\|\log Q^*Z\|_2 = |\log \sigma_1(Z)| = \|\log U_p^*Z\|_2$ for any $Q_{22} \in U(n-1)$ such that
$\|\log Q_{22}\, \mathrm{diag}(\sigma_2, \sigma_3, \ldots, \sigma_n)\|_2 \le \|\log U_p^*Z\|_2$. Note that such $Q_{22}$ always includes $I_{n-1}$, but
may not include the entire set of $(n-1) \times (n-1)$ unitary matrices, as is evident from the above
simple example.

Similarly, if $\|\log U_p^*Z\|_2 = |\log \sigma_n(Z)| \ge |\log \sigma_1(Z)|$, then we have $\|\log Q^*Z\|_2 = \|\log U_p^*Z\|_2$
for $Q = U\, \mathrm{diag}(Q_{22}, 1)\, V^*$ where $Q_{22}$ can be any $(n-1) \times (n-1)$ unitary matrix satisfying
$\|\log Q_{22}\, \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_{n-1})\|_2 \le \|\log U_p^*Z\|_2$.
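The two-dimensional example can be evaluated directly, because the principal logarithm of a diagonal matrix acts entrywise. The sketch below assumes the reading $Z = \mathrm{diag}(e, 1)$ and $U_1 = \mathrm{diag}(1, e^{i\vartheta})$ of the example (the printed matrices are damaged in the source, so this reading is an assumption); the helper names are illustrative.

```python
import cmath
import math

def log_diag(entries):
    """Principal logarithm of a diagonal matrix, returned as its diagonal."""
    return [cmath.log(v) for v in entries]

def spectral_norm_diag(diagonal):
    return max(abs(v) for v in diagonal)

def frobenius_norm_diag(diagonal):
    return math.sqrt(sum(abs(v) ** 2 for v in diagonal))

z = [math.e, 1.0]  # assumed reading of the example: Z = diag(e, 1), so U_p = I

def norms(theta):
    """Spectral and Frobenius norm of log(U_1 Z) for U_1 = diag(1, e^{i theta})."""
    u = [1.0, cmath.exp(1j * theta)]
    log_uz = log_diag([u[k] * z[k] for k in range(2)])
    return spectral_norm_diag(log_uz), frobenius_norm_diag(log_uz)

thetas = (-1.0, -0.5, 0.0, 0.5, 1.0)
spectral_values = [norms(t)[0] for t in thetas]
frobenius_values = [norms(t)[1] for t in thetas]
print(spectral_values)
print(frobenius_values)
```

The spectral norm is insensitive to $\vartheta \in [-1, 1]$, whereas the Frobenius norm is smallest only at $\vartheta = 0$, in line with the uniqueness statement for the Frobenius norm.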

6.2 Non-uniqueness of $U_p$ as the minimizer of $\|\mathrm{sym}_*(\mathrm{Log}(Q^*Z))\|^2$

The fact that $U_p$ is not the unique minimizer of $\|\mathrm{sym}_*(\mathrm{Log}(Q^*Z))\|^2$ can be seen by the
simple example $Z = I$. Then $\mathrm{Log}\, Q^*$ is a skew-Hermitian matrix, so $\mathrm{sym}_*(\mathrm{Log}(Q^*Z)) = 0$
for any unitary $Q$.

In general, every $Q$ of the following form gives the same value of $\|\mathrm{sym}_*(\mathrm{Log}(Q^*Z))\|^2$. Let
$Z = U \Sigma V^*$ be the SVD with $\Sigma = \mathrm{diag}(\sigma_1 I_{n_1}, \sigma_2 I_{n_2}, \ldots, \sigma_k I_{n_k})$ where $n_1 + n_2 + \cdots + n_k = n$
($k = n$ if $Z$ has distinct singular values). Then it can be seen that any unitary $Q$ of the form

  $Q = U\, \mathrm{diag}(Q_{n_1}, Q_{n_2}, \ldots, Q_{n_k})\, V^*$ ,  (6.4)

where $Q_{n_i}$ is any $n_i \times n_i$ unitary matrix, yields $\|\mathrm{sym}_*(\mathrm{Log}(Q^*Z))\|^2 = \|\mathrm{sym}_*(\log(U_p^*Z))\|^2$.
Note that this holds for any unitarily invariant norm.

The above argument naturally leads to the question of whether $U_p$ is unique up to $Q_{n_i}$
in (6.4). In particular, when the singular values of $Z$ are distinct, is $U_p$ determined up to
scalar rotations $Q_{n_i} = e^{i\vartheta_{n_i}}$?

For the spectral norm an argument similar to that above shows there can be many $Q$ for
which $\|\mathrm{sym}_*(\mathrm{Log}(Q^*Z))\|_2 = \|\mathrm{sym}_*(\log(U_p^*Z))\|_2$.

For the Frobenius norm, the answer is yes. To verify this, observe in (4.14) that
$\|\mathrm{sym}_* \mathrm{Log}(Q^*D)\|_F = \|\log D\|_F$ implies $(x_1, x_2, x_3) = (d_1, d_2, d_3)$ and hence
$\mathrm{Log}(Q^*D) = Q_1^*\, \mathrm{diag}(\log d_1, \log d_2, \log d_3)\, Q_1 + S$, where $S$ is a skew-Hermitian matrix. Hence

$$
\exp(Q_1^*\, \mathrm{diag}(\log d_1, \log d_2, \log d_3)\, Q_1 + S) = Q^*D , \tag{6.5}
$$

and by (4.2) we have

$$
\begin{aligned}
\|Q^*D\|_F &= \|\exp(Q_1^*\, \mathrm{diag}(\log d_1, \log d_2, \log d_3)\, Q_1 + S)\|_F \\
&\le \|\exp(Q_1^*\, \mathrm{diag}(\log d_1, \log d_2, \log d_3)\, Q_1)\|_F = \|Q^*D\|_F .
\end{aligned}
$$

Since equality in (4.2) holds for the Frobenius norm if and only if $X$ is normal (which
can be seen from the proof of [4, Thm. IX.3.1]), for the last inequality to be an equality,
$Q_1^*\, \mathrm{diag}(\log d_1, \log d_2, \log d_3)\, Q_1 + S$ must be a normal matrix. Since $Q_1^*\, \mathrm{diag}(\log d_1, \log d_2, \log d_3)\, Q_1$
is Hermitian and $S$ is skew-Hermitian, this means
$Q_1^*\, \mathrm{diag}(\log d_1, \log d_2, \log d_3)\, Q_1 + S = Q_1^*\, \mathrm{diag}(i s_1 + \log d_1, i s_2 + \log d_2, i s_3 + \log d_3)\, Q_1$ for $s_i \in \mathbb{R}$. Together with (6.5) we conclude
that

$$
Q^*D = Q_1^*\, \mathrm{diag}(d_1 e^{i s_1}, d_2 e^{i s_2}, d_3 e^{i s_3})\, Q_1 .
$$

By an argument similar to that following (6.3) we obtain $Q = \mathrm{diag}(e^{-i s_1}, e^{-i s_2}, e^{-i s_3})$.



7 Conclusion and outlook

The result in the Frobenius matrix norm cases for $n = 2, 3$ hinges crucially on the use of the
new sum of squared logarithms inequality (4.10). This inequality seems to be true in any
dimension with appropriate additional conditions [8]. However, we do not have a proof yet.

Nevertheless, numerical experiments suggest that the optimality of the polar factor $U_p$
in both

$$
\min_{Q \in U(n)} \|\mathrm{Log}(Q^*Z)\|^2 , \qquad \min_{Q \in U(n)} \|\mathrm{sym}_* \mathrm{Log}(Q^*Z)\|^2 \tag{7.1}
$$

is true for any unitarily invariant norm, over $\mathbb{R}$ and $\mathbb{C}$ and in any dimension. This would
imply that for all $\mu, \mu_c \ge 0$ and for any unitarily invariant norm

$$
\min_{Q \in U(n)} \mu\, \|\mathrm{sym}_* \mathrm{Log}(Q^*Z)\|^2 + \mu_c\, \|\mathrm{skew}_* \mathrm{Log}(Q^*Z)\|^2 = \mu\, \|\log(U_p^*Z)\|^2 = \mu\, \|\log \sqrt{Z^*Z}\|^2 .
$$

We also conjecture that $Q = U_p$ is the unique unitary matrix that minimizes $\|\mathrm{Log}(Q^*Z)\|^2$
for every unitarily invariant norm.

In a forthcoming contribution [35] we will use our new characterization of the orthogonal
factor in the polar decomposition to calculate the geodesic distance of the isochoric part of
the deformation gradient $\frac{F}{\det(F)^{1/3}} \in \mathrm{SL}(3)$ to $\mathrm{SO}(3)$ in the canonical left-invariant Riemannian
metric on $\mathrm{SL}(3)$, namely based on (5.3)

$$
\mathrm{dist}^2_{\mathrm{geod}, \mathrm{SL}(3)}\!\left( \frac{F}{\det(F)^{1/3}} , \mathrm{SO}(3) \right) = \|\mathrm{dev}_3 \log \sqrt{F^T F}\|_F^2 = \min_{Q \in \mathrm{SO}(3)} \|\mathrm{dev}_3\, \mathrm{sym}_* \mathrm{Log}\, Q^T F\|_F^2 .
$$

Thereby, we provide a rigorous geometric justification for the preferred use of the Hencky-strain
measure $\|\log \sqrt{F^T F}\|_F^2$ in nonlinear elasticity and plasticity theory, see [THl H2] and
the references therein.
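8 Appendix

8.1 Connections between $\mathbb{C}^\times$ and CSO(2)

The following connections between $\mathbb{C}^\times$ and $\mathrm{CSO}(2) = \mathbb{R}^+ \cdot \mathrm{SO}(2)$ are clear, where
$\|Z\|^2_{\mathrm{CSO}} := \frac{1}{2} \|Z\|_F^2$:

$$
\begin{aligned}
z = a + i b \quad &\Leftrightarrow \quad Z = \begin{pmatrix} a & b \\ -b & a \end{pmatrix} , \\
\bar z = a - i b = a + i(-b) \quad &\Leftrightarrow \quad Z^T = \begin{pmatrix} a & -b \\ b & a \end{pmatrix} , \\
z \bar z = \bar z z \quad &\Leftrightarrow \quad Z Z^T = Z^T Z , \\
|z|^2 = a^2 + b^2 \quad &= \quad \|Z\|^2_{\mathrm{CSO}} = \tfrac{1}{2}\, \mathrm{tr}(Z^T Z) = \tfrac{1}{2}\, \|Z\|_F^2 , \\
z \cdot w = w \cdot z \quad &\Leftrightarrow \quad Z \cdot W = W \cdot Z ,
\end{aligned}
\tag{8.1}
$$

$$
\begin{aligned}
\mathrm{Re}(z) = \frac{z + \bar z}{2} = a \quad &\Leftrightarrow \quad \mathrm{sym}_*(Z) = \tfrac{1}{2}(Z + Z^T) = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} , \\
\mathrm{Im}(z) = \frac{z - \bar z}{2 i} = b \quad &\Leftrightarrow \quad \mathrm{skew}_*(Z) = \tfrac{1}{2}(Z - Z^T) = \begin{pmatrix} 0 & b \\ -b & 0 \end{pmatrix} , \\
|\mathrm{Re}(z)|^2 = |a|^2 \quad &= \quad \|\mathrm{sym}_* Z\|^2_{\mathrm{CSO}} = \tfrac{1}{2}\, \|\mathrm{sym}_* Z\|_F^2 , \\
|\mathrm{Im}(z)|^2 = |b|^2 \quad &= \quad \|\mathrm{skew}_* Z\|^2_{\mathrm{CSO}} = \tfrac{1}{2}\, \|\mathrm{skew}_* Z\|_F^2 , \\
|z|^2 = \mathrm{Re}(z)^2 + \mathrm{Im}(z)^2 \quad &= \quad \det(Z) ,
\end{aligned}
\tag{8.2}
$$

$$
\begin{aligned}
e^{i\vartheta} = \cos\vartheta + i \sin\vartheta \quad &\Leftrightarrow \quad Q = \begin{pmatrix} \cos\vartheta & \sin\vartheta \\ -\sin\vartheta & \cos\vartheta \end{pmatrix} \in \mathrm{SO}(2) , \quad \vartheta \in (-\pi, \pi] , \\
e^{-i\vartheta} z = \overline{(e^{i\vartheta})}\, z \quad &\Leftrightarrow \quad Q^T Z , \\
z = e^{i \arg(z)} |z| \quad &\Leftrightarrow \quad Z = U_p H , \quad \text{polar form versus polar decomposition} , \\
|z| \quad &\Leftrightarrow \quad H = \sqrt{Z^T Z} = \sqrt{\det(Z)}\, I_2 , \\
e^{i \arg(z)} \quad &\Leftrightarrow \quad U_p = Z H^{-1} = \frac{1}{\sqrt{a^2 + b^2}} \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \in \mathrm{SO}(2) , \\
|e^z| = |e^{a + i b}| = |e^{\mathrm{Re}(z)}| \quad &\Leftrightarrow \quad \|\exp(Z)\|_F = \|\exp(a I_2 + \mathrm{skew}_*(Z))\|_F \\
&\phantom{\Leftrightarrow} \quad = \|\exp(a I_2) \exp(\mathrm{skew}_*(Z))\|_F = \|\exp(a I_2)\|_F = \|\exp(\mathrm{sym}_*(Z))\|_F .
\end{aligned}
\tag{8.3}
$$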
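The dictionary between complex numbers and conformal $2 \times 2$ matrices can be tested mechanically. In this sketch the embedding $a + ib \mapsto \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$ is an assumed convention, chosen to agree with the rotation matrix and the polar factor listed among the correspondences; the sample values are arbitrary.

```python
def to_matrix(z: complex):
    """Map z = a + ib to [[a, b], [-b, a]] (assumed sign convention)."""
    a, b = z.real, z.imag
    return [[a, b], [-b, a]]

def mat_mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def frob_sq(m):
    return sum(v * v for row in m for v in row)

def max_diff(p, q):
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))

z, w = 1.5 - 2.0j, -0.5 + 0.75j
Z, W = to_matrix(z), to_matrix(w)

product_error = max_diff(mat_mul(Z, W), to_matrix(z * w))   # multiplication is preserved
commutator_error = max_diff(mat_mul(Z, W), mat_mul(W, Z))   # CSO(2) is commutative
print(product_error, commutator_error, abs(z) ** 2, det(Z), frob_sq(Z) / 2.0)
```

The last three printed numbers agree, reflecting $|z|^2 = \det(Z) = \frac{1}{2} \|Z\|_F^2$.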



8.2 Optimality properties of the polar form

The polar decomposition is the matrix analog of the polar form of a complex number

$$
z = e^{i \arg(z)}\, |z| . \tag{8.4}
$$

The argument $\arg(z)$ determines the unitary part $e^{i \arg(z)}$ while the positive definite Hermitian matrix is $|z|$.
The argument $\arg(z)$ in the polar form is optimal in the sense that

$$
\min_{\vartheta \in (-\pi, \pi]} \mu\, |e^{-i\vartheta} z - 1|^2 = \min_{\vartheta \in (-\pi, \pi]} \mu\, |z - e^{i\vartheta}|^2 = \mu\, ||z| - 1|^2 , \quad \vartheta = \arg(z) . \tag{8.5}
$$

However, considering only the real (Hermitian) part

$$
\min_{\vartheta \in (-\pi, \pi]} \mu\, |\mathrm{Re}(e^{-i\vartheta} z - 1)|^2 =
\begin{cases}
\mu\, ||z| - 1|^2 , & |z| \le 1 , \ \vartheta = \arg(z) , \\
0 , & |z| > 1 , \ \vartheta : \cos(\arg(z) - \vartheta) = \frac{1}{|z|} ,
\end{cases}
\tag{8.6}
$$

shows that optimality of $\vartheta = \arg(z)$ ceases to be true for $|z| > 1$ and the optimal $\vartheta$ is not unique. This is
the nonclassical solution alluded to in (1.6). In fact we have optimality of the polar factor for the Euclidean
weighted family only for $\mu_c \ge \mu$:

$$
\min_{\vartheta \in (-\pi, \pi]} \mu\, |\mathrm{Re}(e^{-i\vartheta} z - 1)|^2 + \mu_c\, |\mathrm{Im}(e^{-i\vartheta} z - 1)|^2 = \mu\, ||z| - 1|^2 , \quad \vartheta = \arg(z) , \tag{8.7}
$$

while for $\mu > \mu_c \ge 0$ there always exists a $z \in \mathbb{C}$ such that

$$
\min_{\vartheta \in (-\pi, \pi]} \mu\, |\mathrm{Re}(e^{-i\vartheta} z - 1)|^2 + \mu_c\, |\mathrm{Im}(e^{-i\vartheta} z - 1)|^2 < \mu\, ||z| - 1|^2 . \tag{8.8}
$$

In pronounced contrast, for the logarithmic weighted family the polar factor is optimal for all choices of
weighting factors $\mu, \mu_c \ge 0$:

$$
\min_{\vartheta \in (-\pi, \pi]} \mu\, |\mathrm{Re}\, \mathrm{Log}_{\mathbb{C}}(e^{-i\vartheta} z)|^2 + \mu_c\, |\mathrm{Im}\, \mathrm{Log}_{\mathbb{C}}(e^{-i\vartheta} z)|^2 =
\begin{cases}
\mu\, |\log |z||^2 , & \mu_c > 0 : \ \vartheta = \arg(z) , \\
\mu\, |\log |z||^2 , & \mu_c = 0 : \ \vartheta \text{ arbitrary} .
\end{cases}
\tag{8.9}
$$

Thus we may say that the more fundamental characterization of the polar factor as minimizer is given by
the property with respect to the logarithmic weighted family.
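The contrast between (8.6) and (8.9) can be seen with a brute-force search over $\vartheta$. This is a minimal sketch with $\mu = 1$ in the Euclidean case, using the principal branch of `cmath.log` as one particular choice of $\mathrm{Log}_{\mathbb{C}}$; the grid resolution, the sample point $z$ and the weights are arbitrary assumptions.

```python
import cmath
import math

def theta_grid(steps=3600):
    """Uniform grid on (-pi, pi]."""
    return [-math.pi + 2.0 * math.pi * (k + 1) / steps for k in range(steps)]

def euclidean_real_part(z, theta):
    """|Re(e^{-i theta} z - 1)|^2, the objective of (8.6) with mu = 1."""
    return (cmath.exp(-1j * theta) * z - 1.0).real ** 2

def logarithmic_family(z, theta, mu, mu_c):
    """Objective of (8.9), evaluated with the principal complex logarithm."""
    w = cmath.log(cmath.exp(-1j * theta) * z)
    return mu * w.real ** 2 + mu_c * w.imag ** 2

z = 2.0 * cmath.exp(0.5j)   # |z| = 2 > 1, arg(z) = 0.5
grid = theta_grid()

best_euclid = min(euclidean_real_part(z, t) for t in grid)
best_log, best_theta = min((logarithmic_family(z, t, 1.0, 0.5), t) for t in grid)
print(best_euclid, (abs(z) - 1.0) ** 2)
print(best_log, math.log(2.0) ** 2, best_theta)
```

For this $z$ the Euclidean real-part objective drops to (numerically) zero away from $\vartheta = \arg(z)$, whereas the logarithmic objective is minimized near $\vartheta = \arg(z)$ with value close to $(\log |z|)^2$.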

8.3 The three-parameter case SL(2) by hand

8.3.1 Closed form exponential on SL(2) and closed form principal logarithm

The exponential on $\mathfrak{sl}(2)$ can be given in closed form, see [HJ p. 78] and [5]. Here, the two-dimensional
Cayley-Hamilton theorem is useful: $X^2 - \mathrm{tr}(X)\, X + \det(X)\, I_2 = 0$. Thus, for $X$ with $\mathrm{tr}(X) = 0$, it holds
$X^2 = -\det(X)\, I_2$ and $\mathrm{tr}(X^2) = -2 \det(X)$. Moreover, every higher power $X^k$ can be expressed in $I_2$ and
$X$, which shows that $\exp(X) = \alpha(X)\, I_2 + \beta(X)\, X$. Tarantola [41] defines the "near zero subset" of $\mathfrak{sl}(2)$

$$
\mathfrak{sl}(2)_0 := \left\{ X \in \mathfrak{sl}(2) \;\middle|\; \mathrm{Im} \sqrt{\tfrac{1}{2}\, \mathrm{tr}(X^2)} < \pi \right\} \tag{8.10}
$$

and the "near identity subset" of $\mathrm{SL}(2)$

$$
\mathrm{SL}(2)_I := \mathrm{SL}(2) \setminus \{\text{both eigenvalues are real and negative}\} . \tag{8.11}
$$

On this set the principal matrix logarithm is real. Complex eigenvalues always appear in conjugate pairs,
therefore the eigenvalues are either both real or both complex in the two-dimensional case. Then it holds [15, p. 149]

$$
\exp(X) =
\begin{cases}
\cosh(\sqrt{-\det(X)})\, I_2 + \dfrac{\sinh(\sqrt{-\det(X)})}{\sqrt{-\det(X)}}\, X , & \det(X) < 0 , \\[2ex]
\cos(\sqrt{\det(X)})\, I_2 + \dfrac{\sin(\sqrt{\det(X)})}{\sqrt{\det(X)}}\, X , & \det(X) > 0 , \\[2ex]
I_2 + X , & \det(X) = 0 .
\end{cases}
\tag{8.12}
$$
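The closed form (8.12) can be compared against the power series of the exponential. The following sketch is for illustration; the three test matrices and the number of Taylor terms are arbitrary choices, and the function names are not from the text.

```python
import math

def exp_sl2_closed_form(x):
    """Closed-form exponential of a real trace-free 2x2 matrix, following (8.12)."""
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    if det < 0:
        r = math.sqrt(-det)
        alpha, beta = math.cosh(r), math.sinh(r) / r
    elif det > 0:
        r = math.sqrt(det)
        alpha, beta = math.cos(r), math.sin(r) / r
    else:
        alpha, beta = 1.0, 1.0
    return [[alpha * (i == j) + beta * x[i][j] for j in range(2)] for i in range(2)]

def exp_taylor(x, terms=40):
    """Reference value from the truncated power series of exp."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms + 1):
        term = [[sum(term[i][m] * x[m][j] for m in range(2)) / k for j in range(2)]
                for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

hyperbolic = [[1.0, 2.0], [0.5, -1.0]]   # det = -2 < 0
elliptic = [[0.0, 1.5], [-1.5, 0.0]]     # det = 2.25 > 0
nilpotent = [[0.0, 3.0], [0.0, 0.0]]     # det = 0

errors = [max_diff(exp_sl2_closed_form(m), exp_taylor(m))
          for m in (hyperbolic, elliptic, nilpotent)]
print(errors)
```

In all three branches the result has determinant one, as expected for the exponential of a trace-free matrix.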
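Therefore

$$
\exp : \mathfrak{sl}(2)_0 \to \mathrm{SL}(2)_I , \quad \exp(X) = \cosh(s)\, I_2 + \frac{\sinh(s)}{s}\, X , \quad s := \sqrt{\tfrac{1}{2}\, \mathrm{tr}(X^2)} = \sqrt{-\det(X)} . \tag{8.13}
$$

Since the argument $s = \sqrt{-\det(X)}$ is complex valued for $\det(X) > 0$ we note that $\cosh(i y) = \cos(y)$.

The one-parameter $\mathrm{SO}(2)$ case is included in the former formula for the exponential. The previous
formula (8.13) can be specialized to $\mathfrak{so}(2, \mathbb{R})$. Then

$$
\begin{aligned}
\exp \begin{pmatrix} 0 & \alpha \\ -\alpha & 0 \end{pmatrix}
&= \cosh(\sqrt{-\alpha^2})\, I_2 + \frac{\sinh(\sqrt{-\alpha^2})}{\sqrt{-\alpha^2}} \begin{pmatrix} 0 & \alpha \\ -\alpha & 0 \end{pmatrix}
= \cosh(i |\alpha|)\, I_2 + \frac{\sinh(i |\alpha|)}{i |\alpha|} \begin{pmatrix} 0 & \alpha \\ -\alpha & 0 \end{pmatrix} \\
&= \cos(|\alpha|)\, I_2 + \frac{\sin(|\alpha|)}{|\alpha|} \begin{pmatrix} 0 & \alpha \\ -\alpha & 0 \end{pmatrix}
= \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} .
\end{aligned}
\tag{8.14}
$$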



We need to observe that the exponential function is not surjective onto $\mathrm{SL}(2)$ since not every matrix
$Z \in \mathrm{SL}(2)$ can be written as $Z = \exp(X)$ for $X \in \mathfrak{sl}(2)$. This is the case because $\forall\, X \in \mathfrak{sl}(2) : \ \mathrm{tr}(\exp(X)) \ge -2$,
see (8.12)$_2$. Thus, any matrix $Z \in \mathrm{SL}(2)$ with $\mathrm{tr}(Z) < -2$ is not the exponential of any real matrix
$X \in \mathfrak{sl}(2)$. The logarithm on $\mathrm{SL}(2)$ can also be given in closed form [41, (1.175)].$^4$ On the set $\mathrm{SL}(2)_I$ the
principal matrix logarithm is real and we have

$$
\log : \mathrm{SL}(2)_I \to \mathfrak{sl}(2) , \quad \log[S] := \frac{s}{\sinh s}\, (S - \cosh(s)\, I_2) , \quad \cosh(s) = \frac{\mathrm{tr}(S)}{2} . \tag{8.15}
$$
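Formula (8.15) is easy to evaluate when $\mathrm{tr}(S) > 2$, since then $s = \mathrm{arcosh}(\mathrm{tr}(S)/2)$ is real. The sketch below covers only this hyperbolic case (the restriction and the function name are assumptions of the illustration) and applies it to a diagonal matrix $D = \mathrm{diag}(\lambda, 1/\lambda)$, whose logarithm is known.

```python
import math

def log_sl2_hyperbolic(s_mat):
    """Formula (8.15) for S in SL(2) with tr(S) > 2, where s = arcosh(tr(S)/2) is real."""
    half_trace = (s_mat[0][0] + s_mat[1][1]) / 2.0
    s = math.acosh(half_trace)
    factor = s / math.sinh(s)
    return [[factor * (s_mat[i][j] - (half_trace if i == j else 0.0)) for j in range(2)]
            for i in range(2)]

lam = 3.0
d = [[lam, 0.0], [0.0, 1.0 / lam]]
log_d = log_sl2_hyperbolic(d)
frob_sq = sum(v * v for row in log_d for v in row)
print(log_d, frob_sq, 2.0 * math.log(lam) ** 2)
```

The result is $\mathrm{diag}(\log \lambda, -\log \lambda)$, so its squared Frobenius norm equals $2 (\log \lambda)^2$.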

8.3.2 Minimizing $\|\log Q^T D\|_F^2$ for $Q^T D \in \mathrm{SL}(2)_I$

Let us define the open set

$$
\mathcal{R}_D := \{ Q \in \mathrm{SO}(2) \mid Q^T D \in \mathrm{SL}(2)_I \} . \tag{8.16}
$$

On $\mathcal{R}_D$ the evaluation of $\log R^T D$ is in the domain of the principal logarithm. We are now able to show the
optimality result with respect to rotations in $\mathcal{R}_D$. We use the given formula (8.15) for the real logarithm on
$\mathrm{SL}(2)_I$ to compute

$^4$ In fact, the logarithm on diagonal matrices $D$ in $\mathrm{SL}(2)$ with positive eigenvalues is simple. Their
trace is always $\lambda + \frac{1}{\lambda} \ge 2$ if $\lambda > 0$. We infer that $\cosh(s) = \frac{\mathrm{tr}(D)}{2}$ can always be solved for $s$. Observe
$\cosh(s)^2 - \sinh(s)^2 = 1$.



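$$
\begin{aligned}
\inf_{R \in \mathcal{R}_D} \|\log R^T D\|_F^2
&= \inf_{R \in \mathcal{R}_D} \frac{s^2}{\sinh(s)^2}\, \|R^T D - \cosh(s)\, I_2\|_F^2 \\
&= \inf_{R \in \mathcal{R}_D} \frac{s^2}{\sinh(s)^2} \left( \|R^T D\|_F^2 - 2 \cosh(s)\, \langle R^T D, I_2 \rangle + 2 \cosh(s)^2 \right) \\
&\qquad \left( \text{using } \cosh(s) = \tfrac{\mathrm{tr}(R^T D)}{2} \right) \\
&= \inf_{R \in \mathcal{R}_D} \frac{s^2}{\sinh(s)^2} \left( \|R^T D\|_F^2 - 2 \cosh(s)^2 \right) \\
&= \inf_{R \in \mathcal{R}_D} \frac{s^2}{\sinh(s)^2} \left( \|R^T D\|_F^2 - \tfrac{1}{2}\, \mathrm{tr}(R^T D)^2 \right) \\
&= \inf_{R \in \mathcal{R}_D,\ \cosh(s) = \frac{\mathrm{tr}(R^T D)}{2}} \frac{s^2}{\sinh(s)^2} \left( \|D\|_F^2 - \tfrac{1}{2}\, \mathrm{tr}(R^T D)^2 \right) \\
&\ge \inf_{R \in \mathcal{R}_D,\ \cosh(s) = \frac{\mathrm{tr}(R^T D)}{2}} \frac{s^2}{\sinh(s)^2} \left( \|D\|_F^2 - \tfrac{1}{2}\, \mathrm{tr}(D)^2 \right) \qquad \text{(optimality of the orthogonal factor)} \\
&= \|\mathrm{dev}_2 D\|_F^2 \inf_{R \in \mathcal{R}_D,\ \cosh(s) = \frac{\mathrm{tr}(R^T D)}{2}} \frac{s^2}{\sinh(s)^2} ,
\end{aligned}
\tag{8.17}
$$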

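and on the other hand

$$
\begin{aligned}
\|\log D\|_F^2 &= \frac{s^2}{\sinh(s)^2}\, \|D - \tfrac{1}{2}\, \mathrm{tr}(D)\, I_2\|_F^2 , \quad \text{with } s \in \mathbb{R} \text{ such that } \cosh(s) = \frac{\mathrm{tr}(D)}{2} = \frac{1}{2} \left( \lambda + \frac{1}{\lambda} \right) \\
&= \frac{s^2}{\sinh(s)^2}\, \|\mathrm{dev}_2 D\|_F^2 = \frac{s^2}{\sinh(s)^2}\, \frac{1}{2} (\lambda_1 - \lambda_2)^2 , \quad \text{since } \|\mathrm{dev}_2 D\|_F^2 = \frac{1}{2} (\lambda_1 - \lambda_2)^2 \\
&= \frac{s^2}{\sinh(s)^2}\, \frac{1}{2} \left( \lambda - \frac{1}{\lambda} \right)^2 = \frac{s^2}{1 + \sinh(s)^2 - 1}\, \frac{1}{2} \left( \lambda - \frac{1}{\lambda} \right)^2 \\
&= \frac{s^2}{\cosh(s)^2 - 1}\, \frac{1}{2} \left( \lambda - \frac{1}{\lambda} \right)^2 \\
&= \frac{\left( \mathrm{arcosh}\left( \frac{\lambda + 1/\lambda}{2} \right) \right)^2}{\frac{1}{4} \left( \lambda + \frac{1}{\lambda} \right)^2 - 1}\, \frac{1}{2} \left( \lambda - \frac{1}{\lambda} \right)^2
= \frac{\left( \mathrm{arcosh}\left( \frac{\lambda + 1/\lambda}{2} \right) \right)^2}{\frac{1}{4} \left( \lambda - \frac{1}{\lambda} \right)^2}\, \frac{1}{2} \left( \lambda - \frac{1}{\lambda} \right)^2 \\
&= 2 \left( \mathrm{arcosh}\left( \frac{\lambda + \frac{1}{\lambda}}{2} \right) \right)^2 = 2 (\log \lambda)^2 .
\end{aligned}
\tag{8.18}
$$

The last equality can be seen by using the identity $\mathrm{arcosh}(x) = \log(x + \sqrt{x^2 - 1})$, $x \ge 1$, and setting
$x = \frac{1}{2} (\lambda + \frac{1}{\lambda})$, where we note that $x \ge 1$ for $\lambda > 0$. Comparing the last result (8.18) with the simple
formula for the principal logarithm on $\mathrm{SL}(2)$ for diagonal matrices $D \in \mathrm{SL}(2)$ with positive eigenvalues
shows

$$
\left\| \log \begin{pmatrix} \lambda & 0 \\ 0 & \frac{1}{\lambda} \end{pmatrix} \right\|_F^2
= \left\| \begin{pmatrix} \log \lambda & 0 \\ 0 & \log \frac{1}{\lambda} \end{pmatrix} \right\|_F^2
= (\log \lambda)^2 + (\log 1 - \log \lambda)^2 = 2 (\log \lambda)^2 . \tag{8.19}
$$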
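To finalize the $\mathrm{SL}(2)$ case we need to show that

$$
\inf_{R \in \mathcal{R}_D,\ \cosh(s) = \frac{\mathrm{tr}(R^T D)}{2}} \frac{s^2}{\sinh(s)^2} = \frac{\hat s^2}{\sinh(\hat s)^2} , \quad \text{with } \hat s \in \mathbb{R} \text{ such that } \cosh(\hat s) = \frac{\mathrm{tr}(D)}{2} = \frac{1}{2} \left( \lambda + \frac{1}{\lambda} \right) . \tag{8.20}
$$

In order to do this, we write

$$
\inf_{R \in \mathcal{R}_D,\ \cosh(s) = \frac{\mathrm{tr}(R^T D)}{2}} \frac{s^2}{\sinh(s)^2}
= \inf_{R \in \mathcal{R}_D,\ \cosh(s) = \frac{\mathrm{tr}(R^T D)}{2}} \frac{s^2}{\cosh(s)^2 - 1}
= \inf_{\xi} \frac{\mathrm{arcosh}(\xi)^2}{\xi^2 - 1} \quad \text{for } \xi = \frac{\mathrm{tr}(R^T D)}{2} .
$$

One can check that the function

$$
g : [1, \infty) \to \mathbb{R}^+ , \quad g(\xi) := \frac{\mathrm{arcosh}(\xi)^2}{\xi^2 - 1}
$$

is strictly monotone decreasing. Thus $g(\xi) = \frac{\mathrm{arcosh}(\xi)^2}{\xi^2 - 1}$ is the smaller, the larger $\xi$ gets. The largest value
for $\xi = \frac{\mathrm{tr}(R^T D)}{2}$ is realized by $\hat\xi = \frac{\mathrm{tr}(D)}{2}$. Therefore

$$
\inf_{\xi} \frac{\mathrm{arcosh}(\xi)^2}{\xi^2 - 1} \ge \frac{\mathrm{arcosh}(\hat\xi)^2}{\hat\xi^2 - 1} = \frac{\hat s^2}{\sinh(\hat s)^2} , \quad \text{with } \hat s \in \mathbb{R} \text{ such that } \cosh(\hat s) = \frac{\mathrm{tr}(D)}{2} . \qquad \blacksquare
$$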
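The monotonicity of $g(\xi) = \mathrm{arcosh}(\xi)^2 / (\xi^2 - 1)$ is stated without proof; a quick numerical look on a grid is consistent with it. The grid below is an arbitrary choice and the check is a plausibility test, not a proof.

```python
import math

def g(xi):
    """g(xi) = arcosh(xi)^2 / (xi^2 - 1) for xi > 1."""
    return math.acosh(xi) ** 2 / (xi * xi - 1.0)

samples = [1.0 + 0.01 * k for k in range(1, 2000)]   # xi from 1.01 up to about 21
values = [g(xi) for xi in samples]
is_decreasing = all(a > b for a, b in zip(values, values[1:]))
print(is_decreasing, values[0], values[-1])
```

With $\xi = \cosh(s)$ the function becomes $s^2 / \sinh(s)^2$, which tends to $1$ as $s \to 0$ and decreases for $s > 0$; the sampled values stay below $1$ accordingly.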